OM nucleic - nucleic search, using sw model



Run on:         January 6, 2004, 01:16:12 ; Search time 94 Seconds
                (without alignments)
                4761.303 Million cell updates/sec

Title:          US-10-088-872-1
Perfect score:  1014
Sequence:       1 atgaaaaaaatgcctttgtt...tgaagaaaacggccccttga 1014

Scoring table:  IDENTITY_NUC
                Gapop 10.0 , Gapext 1.0

Searched:       569978 seqs, 220691566 residues

Total number of hits satisfying chosen parameters:      1139956

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
                 Maximum Match 100%
                 Listing first 45 summaries



Database :      Issued_Patents_NA:*
                /cgn2_6/ptodata/1/ina/5A_COMB.seq:*
                /cgn2_6/ptodata/1/ina/5B_COMB.seq:*
                /cgn2_6/ptodata/1/ina/6A_COMB.seq:*
                /cgn2_6/ptodata/1/ina/6B_COMB.seq:*
                /cgn2_6/ptodata/1/ina/PCTUS_COMB.seq:
                /cgn2_6/ptodata/1/ina/backfiles1.seq:



Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
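
The header above names the search model (sw, that is, Smith-Waterman) and its gap parameters (Gapop 10.0, Gapext 1.0). As a rough illustration only, the Python sketch below computes a local alignment score with affine gaps. The function name `sw_score`, the match score of 1.0, and the mismatch score of -0.6 are assumptions: this report does not print the IDENTITY_NUC table, and -0.6 is simply the value that reproduces the scores listed in the results (for example, 1012 matches and 2 mismatches giving 1010.8). The cost of Gapop + Gapext x length per gap is likewise inferred from the reported scores, not documented here, and the sketch is not the search engine's actual implementation.

```python
def sw_score(query, target, match=1.0, mismatch=-0.6,
             gap_open=10.0, gap_ext=1.0):
    """Return the best local alignment score (Smith-Waterman, affine gaps).

    Assumed cost model: a gap of k positions costs gap_open + gap_ext * k.
    The match/mismatch values are assumptions inferred from the report.
    """
    neg_inf = float("-inf")
    n = len(target)
    h_prev = [0.0] * (n + 1)      # best score of an alignment ending at (i-1, j)
    f_prev = [neg_inf] * (n + 1)  # same, but ending with a gap in the target
    best = 0.0
    for i in range(1, len(query) + 1):
        h_cur = [0.0] * (n + 1)
        f_cur = [neg_inf] * (n + 1)
        e = neg_inf               # best score ending with a gap in the query
        for j in range(1, n + 1):
            # Extend an existing gap or open a new one (open + first extension).
            e = max(e - gap_ext, h_cur[j - 1] - gap_open - gap_ext)
            f_cur[j] = max(f_prev[j] - gap_ext, h_prev[j] - gap_open - gap_ext)
            pair = match if query[i - 1] == target[j - 1] else mismatch
            # Local alignment: never let a cell drop below zero.
            h_cur[j] = max(0.0, h_prev[j - 1] + pair, e, f_cur[j])
            best = max(best, h_cur[j])
        h_prev, f_prev = h_cur, f_cur
    return best


print(sw_score("ACGT", "TTACGTTT"))  # 4.0
```

The sketch keeps only two rows of the dynamic-programming matrix, so it returns the score but not the alignment itself; a traceback would be needed to print Qy/Db lines like those in the ALIGNMENTS section.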
ALIGNMENTS



RESULT 1
US-09-620-312D-111
Sequence 111, Application US/09620312D
Patent No. 6569662
GENERAL INFORMATION:
APPLICANT: Tang, Y. Tom
APPLICANT: Liu, Chenghua
APPLICANT: Asundi, Vinod
APPLICANT: Zhang, Jie
APPLICANT: Ren, Feiyan
APPLICANT: Chen, Rui-hong
APPLICANT: Zhao, Qing A.
APPLICANT: Wehrman, Tom
APPLICANT: Xue, Aidong J.
APPLICANT: Yang, Yonghong
APPLICANT: Wang, Jian-Rui
APPLICANT: Zhou, Ping
APPLICANT: Ma, Yunqing
APPLICANT: Wang, Dunrui
APPLICANT: Wang, Zhiwei
APPLICANT: John Tillinghast
APPLICANT: Drmanac, Radoje T.
TITLE OF INVENTION: Novel Nucleic Acids and
TITLE OF INVENTION: Polypeptides
FILE REFERENCE: 784CIP2B
CURRENT APPLICATION NUMBER: US/09/620,312D
CURRENT FILING DATE: 2000-07-19
PRIOR APPLICATION NUMBER: 09/552,317
PRIOR FILING DATE: 2000-04-25
PRIOR APPLICATION NUMBER: 09/488,725
PRIOR FILING DATE: 2000-01-21
NUMBER OF SEQ ID NOS: 1105
SOFTWARE: pt_FL_genes Version 1.0
SEQ ID NO 111
LENGTH: 1421
TYPE: DNA
ORGANISM: Homo sapiens
FEATURE:
NAME/KEY: CDS
LOCATION: (217)..(1230)
US-09-620-312D-111



Query Match 100.0%; Score 1014; DB 4; Length 1421; 

Best Local Similarity 100.0%; Pred. No. 4.8e-292; 

Matches 1014; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy          1 ATGAAAAAAATGCCTTTGTTTAGTAAATCACACAAAAATCCAGCAGAAATTGTGAAAATC 60
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        217 ATGAAAAAAATGCCTTTGTTTAGTAAATCACACAAAAATCCAGCAGAAATTGTGAAAATC 276

Qy         61 CTGAAAGACAATTTGGCCATTTTGGAAAAGCAAGACAAAAAGACAGAC             120
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        277 CTGAAAGACAATTTGGCCATTTTGGAAAAGCAAGACAAAAAGACAGAC             336

Qy        121 GAAGTGTCTAAATCACTGCAAGCAATGAAAGAAATTCTGTGTGGTACAAACGAGAAAGAA 180
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        337 GAAGTGTCTAAATCACTGCAAGCAATGAAAGAAATTCTGTGTGGTACAAACGAGAAAGAA 396

Qy        181 CCCCCAACAGAAGCAGTGGCTCAGCTAGCACAAGAACTCTACAGCAGTGGCCTGCTAGTG 240
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        397 CCCCCAACAGAAGCAGTGGCTCAGCTAGCACAAGAACTCTACAGCAGTGGCCTGCTAGTG 456

Qy        241 ACACTGATAGCTGACCTGCAGCTGATAGACTTTGAGGGAAAAAAAGATGTGACCCAGATA 300
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        457 ACACTGATAGCTGACCTGCAGCTGATAGACTTTGAGGGAAAAAAAGATGTGACCCAGATA 516

Qy        301 TTTAACAACATCTTGAGAAGACAGATAGGCACTCGGAGTCCTACTGTGGAGTATATTAGT 360
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        517 TTTAACAACATCTTGAGAAGACAGATAGGCACTCGGAGTCCTACTGTGGAGTATATTAGT 576

Qy        361 GCTCATCCTCATATCCTGTTTATGCTCCTCAAAGGATATGAAGCCCCACAGATTGCCTTA 420
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        577 GCTCATCCTCATATCCTGTTTATGCTCCTCAAAGGATATGAAGCCCCACAGATTGCCTTA 636

Qy        421 CGTTGTGGGATTATGCTGAGAGAATGTATTCGACATGAACCACTTGCCAAAATCATCCTC 480
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        637 CGTTGTGGGATTATGCTGAGAGAATGTATTCGACATGAACCACTTGCCAAAATCATCCTC 696

Qy        481 TTTTCTAATCAATTCAGAGATTTCTTTAAGTACGTGGAGTTGTCAACATTTGATATTGCT 540
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        697 TTTTCTAATCAATTCAGAGATTTCTTTAAGTACGTGGAGTTGTCAACATTTGATATTGCT 756

Qy        541 TCAGATGCCTTTGCTACTTTCAAGGATTTACTAACCAGACATAAAGTGTTGGTAGCAGAC 600
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        757 TCAGATGCCTTTGCTACTTTCAAGGATTTACTAACCAGACATAAAGTGTTGGTAGCAGAC 816

Qy        601 TTCTTAGAACAAAATTACGACACTATTTTTGAAGACTATGAGAAATTGCTTCAGTCTGAG 660
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        817 TTCTTAGAACAAAATTACGACACTATTTTTGAAGACTATGAGAAATTGCTTCAGTCTGAG 876

Qy        661 AATTATGTTACTAAGAGACAGTCTTTAAAGCTGCTAGGGGAGCTGATCCTGGACCGTCAC 720
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        877 AATTATGTTACTAAGAGACAGTCTTTAAAGCTGCTAGGGGAGCTGATCCTGGACCGTCAC 936

Qy        721 AACTTTGCCATCATGACAAAGTATATCAGCAAGCCGGAGAACCTGAAACTCATGATGAAC 780
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        937 AACTTTGCCATCATGACAAAGTATATCAGCAAGCCGGAGAACCTGAAACTCATGATGAAC 996

Qy        781 CTCCTTCGGGATAAAAGTCCCAACATCCAGTTTGAAGCCTTTCATGTTTTTAAGGTGTTT 840
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        997 CTCCTTCGGGATAAAAGTCCCAACATCCAGTTTGAAGCCTTTCATGTTTTTAAGGTGTTT 1056

Qy        841 GTGGCCAGTCCTCACAAAACACAGCCTATTGTGGAGATCCTGTTAAAAAATCAGCCCAAA 900
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1057 GTGGCCAGTCCTCACAAAACACAGCCTATTGTGGAGATCCTGTTAAAAAATCAGCCCAAA 1116

Qy        901 CTCATTGAGTTTCTGAGCAGCTTCCAAAAAGAAAGGACGGATGATGAGCAGTTCGCTGAC 960
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1117 CTCATTGAGTTTCTGAGCAGCTTCCAAAAAGAAAGGACGGATGATGAGCAGTTCGCTGAC 1176

Qy        961 GAGAAGAACTACTTGATTAAACAGATCCGAGACTTGAAGAAAACGGCCCCTTGA 1014
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1177 GAGAAGAACTACTTGATTAAACAGATCCGAGACTTGAAGAAAACGGCCCCTTGA 1230




RESULT 2
US-09-190-965-2
Sequence 2, Application US/09190965
Patent No. 6071721
GENERAL INFORMATION:
APPLICANT: Tang, Y. Tom
APPLICANT: Guegler, Karl J.
APPLICANT: Corley, Neil C.
APPLICANT: Gorgone, Gina A.
TITLE OF INVENTION: CALCIUM BINDING PROTEIN
FILE REFERENCE: PF-0635 US
CURRENT APPLICATION NUMBER: US/09/190,965
CURRENT FILING DATE: 1998-11-13
NUMBER OF SEQ ID NOS: 5
SOFTWARE: PERL Program
SEQ ID NO 2
LENGTH: 1344
TYPE: DNA
ORGANISM: Homo sapiens
FEATURE: -
OTHER INFORMATION: 3734805
US-09-190-965-2



Query Match 99.7%; Score 1010.8; DB 3; Length 1344; 

Best Local Similarity 99.8%; Pred. No. 4.2e-291; 

Matches 1012; Conservative 0; Mismatches 2; Indels 0; Gaps 0; 



Qy          1 ATGAAAAAAATGCCTTTGTTTAGTAAATCACACAAAAATCCAGCAGAAATTGTGAAAATC 60
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        124 ATGAAAAAAATGCCTTTGTTTAGTAAATCACACAAAAATCCAGCAGAAATTGTGAAAATC 183

Qy         61 CTGAAAGACAATTTGGCCATTTTGGAAAAGCAAGACAAAAAGACAGAC             120
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        184 CTGAAAGACAATTTGGCCATTTTGGAAAAGCAAGACAAAAAGACAGAC             243

Qy        121 GAAGTGTCTAAATCACTGCAAGCAATGAAAGAAATTCTGTGTGGTACAAACGAGAAAGAA 180
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        244 GAAGTGTCTAAATCACTGCAAGCAATGAAAGAAATTCTGTGTGGTACAAACGAGAAAGAA 303

Qy        181 CCCCCAACAGAAGCAGTGGCTCAGCTAGCACAAGAACTCTACAGCAGTGGCCTGCTAGTG 240
              ||||| |||||||||||||||||||||||||||||||||||||||||||||||||| |||
Db        304 CCCCCGACAGAAGCAGTGGCTCAGCTAGCACAAGAACTCTACAGCAGTGGCCTGCTGGTG 363

Qy        241 ACACTGATAGCTGACCTGCAGCTGATAGACTTTGAGGGAAAAAAAGATGTGACCCAGATA 300
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        364 ACACTGATAGCTGACCTGCAGCTGATAGACTTTGAGGGAAAAAAAGATGTGACCCAGATA 423

Qy        301 TTTAACAACATCTTGAGAAGACAGATAGGCACTCGGAGTCCTACTGTGGAGTATATTAGT 360
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        424 TTTAACAACATCTTGAGAAGACAGATAGGCACTCGGAGTCCTACTGTGGAGTATATTAGT 483

Qy        361 GCTCATCCTCATATCCTGTTTATGCTCCTCAAAGGATATGAAGCCCCACAGATTGCCTTA 420
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        484 GCTCATCCTCATATCCTGTTTATGCTCCTCAAAGGATATGAAGCCCCACAGATTGCCTTA 543

Qy        421 CGTTGTGGGATTATGCTGAGAGAATGTATTCGACATGAACCACTTGCCAAAATCATCCTC 480
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        544 CGTTGTGGGATTATGCTGAGAGAATGTATTCGACATGAACCACTTGCCAAAATCATCCTC 603

Qy        481 TTTTCTAATCAATTCAGAGATTTCTTTAAGTACGTGGAGTTGTCAACATTTGATATTGCT 540
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        604 TTTTCTAATCAATTCAGAGATTTCTTTAAGTACGTGGAGTTGTCAACATTTGATATTGCT 663

Qy        541 TCAGATGCCTTTGCTACTTTCAAGGATTTACTAACCAGACATAAAGTGTTGGTAGCAGAC 600
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        664 TCAGATGCCTTTGCTACTTTCAAGGATTTACTAACCAGACATAAAGTGTTGGTAGCAGAC 723

Qy        601 TTCTTAGAACAAAATTACGACACTATTTTTGAAGACTATGAGAAATTGCTTCAGTCTGAG 660
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        724 TTCTTAGAACAAAATTACGACACTATTTTTGAAGACTATGAGAAATTGCTTCAGTCTGAG 783

Qy        661 AATTATGTTACTAAGAGACAGTCTTTAAAGCTGCTAGGGGAGCTGATCCTGGACCGTCAC 720
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        784 AATTATGTTACTAAGAGACAGTCTTTAAAGCTGCTAGGGGAGCTGATCCTGGACCGTCAC 843

Qy        721 AACTTTGCCATCATGACAAAGTATATCAGCAAGCCGGAGAACCTGAAACTCATGATGAAC 780
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        844 AACTTTGCCATCATGACAAAGTATATCAGCAAGCCGGAGAACCTGAAACTCATGATGAAC 903

Qy        781 CTCCTTCGGGATAAAAGTCCCAACATCCAGTTTGAAGCCTTTCATGTTTTTAAGGTGTTT 840
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        904 CTCCTTCGGGATAAAAGTCCCAACATCCAGTTTGAAGCCTTTCATGTTTTTAAGGTGTTT 963

Qy        841 GTGGCCAGTCCTCACAAAACACAGCCTATTGTGGAGATCCTGTTAAAAAATCAGCCCAAA 900
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        964 GTGGCCAGTCCTCACAAAACACAGCCTATTGTGGAGATCCTGTTAAAAAATCAGCCCAAA 1023

Qy        901 CTCATTGAGTTTCTGAGCAGCTTCCAAAAAGAAAGGACGGATGATGAGCAGTTCGCTGAC 960
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1024 CTCATTGAGTTTCTGAGCAGCTTCCAAAAAGAAAGGACGGATGATGAGCAGTTCGCTGAC 1083

Qy        961 GAGAAGAACTACTTGATTAAACAGATCCGAGACTTGAAGAAAACGGCCCCTTGA 1014
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1084 GAGAAGAACTACTTGATTAAACAGATCCGAGACTTGAAGAAAACGGCCCCTTGA 1137



RESULT 3
US-09-470-253-2
; Sequence 2, Application US/09470253
; Patent No. 6365371
; GENERAL INFORMATION:
; APPLICANT: Tang, Y. Tom
; APPLICANT: Guegler, Karl J.
; APPLICANT: Corley, Neil C.
; APPLICANT: Gorgone, Gina A.
; TITLE OF INVENTION: CALCIUM BINDING PROTEIN
; FILE REFERENCE: PF-0635 US
; CURRENT APPLICATION NUMBER: US/09/470,253
; CURRENT FILING DATE: 1999-12-22
; PRIOR APPLICATION NUMBER: 09/190,965
; PRIOR FILING DATE: 1998-11-13
; NUMBER OF SEQ ID NOS: 5
; SOFTWARE: PERL Program
; SEQ ID NO 2
; LENGTH: 1344
; TYPE: DNA
; ORGANISM: Homo sapiens
; FEATURE: -
; OTHER INFORMATION: 3734805
US-09-470-253-2



Query Match 99.7%; Score 1010.8; DB 4; Length 1344; 

Best Local Similarity 99.8%; Pred. No. 4.2e-291; 

Matches 1012; Conservative 0; Mismatches 2; Indels 0; Gaps 0; 
Qy          1 ATGAAAAAAATGCCTTTGTTTAGTAAATCACACAAAAATCCAGCAGAAATTGTGAAAATC 60
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        124 ATGAAAAAAATGCCTTTGTTTAGTAAATCACACAAAAATCCAGCAGAAATTGTGAAAATC 183

Qy         61 CTGAAAGACAATTTGGCCATTTTGGAAAAGCAAGACAAAAAGACAGAC             120
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        184 CTGAAAGACAATTTGGCCATTTTGGAAAAGCAAGACAAAAAGACAGAC             243

Qy        121 GAAGTGTCTAAATCACTGCAAGCAATGAAAGAAATTCTGTGTGGTACAAACGAGAAAGAA 180
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        244 GAAGTGTCTAAATCACTGCAAGCAATGAAAGAAATTCTGTGTGGTACAAACGAGAAAGAA 303

Qy        181 CCCCCAACAGAAGCAGTGGCTCAGCTAGCACAAGAACTCTACAGCAGTGGCCTGCTAGTG 240
              ||||| |||||||||||||||||||||||||||||||||||||||||||||||||| |||
Db        304 CCCCCGACAGAAGCAGTGGCTCAGCTAGCACAAGAACTCTACAGCAGTGGCCTGCTGGTG 363

Qy        241 ACACTGATAGCTGACCTGCAGCTGATAGACTTTGAGGGAAAAAAAGATGTGACCCAGATA 300
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        364 ACACTGATAGCTGACCTGCAGCTGATAGACTTTGAGGGAAAAAAAGATGTGACCCAGATA 423

Qy        301 TTTAACAACATCTTGAGAAGACAGATAGGCACTCGGAGTCCTACTGTGGAGTATATTAGT 360
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        424 TTTAACAACATCTTGAGAAGACAGATAGGCACTCGGAGTCCTACTGTGGAGTATATTAGT 483

Qy        361 GCTCATCCTCATATCCTGTTTATGCTCCTCAAAGGATATGAAGCCCCACAGATTGCCTTA 420
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        484 GCTCATCCTCATATCCTGTTTATGCTCCTCAAAGGATATGAAGCCCCACAGATTGCCTTA 543

Qy        421 CGTTGTGGGATTATGCTGAGAGAATGTATTCGACATGAACCACTTGCCAAAATCATCCTC 480
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        544 CGTTGTGGGATTATGCTGAGAGAATGTATTCGACATGAACCACTTGCCAAAATCATCCTC 603

Qy        481 TTTTCTAATCAATTCAGAGATTTCTTTAAGTACGTGGAGTTGTCAACATTTGATATTGCT 540
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        604 TTTTCTAATCAATTCAGAGATTTCTTTAAGTACGTGGAGTTGTCAACATTTGATATTGCT 663

Qy        541 TCAGATGCCTTTGCTACTTTCAAGGATTTACTAACCAGACATAAAGTGTTGGTAGCAGAC 600
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        664 TCAGATGCCTTTGCTACTTTCAAGGATTTACTAACCAGACATAAAGTGTTGGTAGCAGAC 723

Qy        601 TTCTTAGAACAAAATTACGACACTATTTTTGAAGACTATGAGAAATTGCTTCAGTCTGAG 660
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        724 TTCTTAGAACAAAATTACGACACTATTTTTGAAGACTATGAGAAATTGCTTCAGTCTGAG 783

Qy        661 AATTATGTTACTAAGAGACAGTCTTTAAAGCTGCTAGGGGAGCTGATCCTGGACCGTCAC 720
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        784 AATTATGTTACTAAGAGACAGTCTTTAAAGCTGCTAGGGGAGCTGATCCTGGACCGTCAC 843

Qy        721 AACTTTGCCATCATGACAAAGTATATCAGCAAGCCGGAGAACCTGAAACTCATGATGAAC 780
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        844 AACTTTGCCATCATGACAAAGTATATCAGCAAGCCGGAGAACCTGAAACTCATGATGAAC 903

Qy        781 CTCCTTCGGGATAAAAGTCCCAACATCCAGTTTGAAGCCTTTCATGTTTTTAAGGTGTTT 840
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        904 CTCCTTCGGGATAAAAGTCCCAACATCCAGTTTGAAGCCTTTCATGTTTTTAAGGTGTTT 963

Qy        841 GTGGCCAGTCCTCACAAAACACAGCCTATTGTGGAGATCCTGTTAAAAAATCAGCCCAAA 900
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        964 GTGGCCAGTCCTCACAAAACACAGCCTATTGTGGAGATCCTGTTAAAAAATCAGCCCAAA 1023

Qy        901 CTCATTGAGTTTCTGAGCAGCTTCCAAAAAGAAAGGACGGATGATGAGCAGTTCGCTGAC 960
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1024 CTCATTGAGTTTCTGAGCAGCTTCCAAAAAGAAAGGACGGATGATGAGCAGTTCGCTGAC 1083

Qy        961 GAGAAGAACTACTTGATTAAACAGATCCGAGACTTGAAGAAAACGGCCCCTTGA 1014
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1084 GAGAAGAACTACTTGATTAAACAGATCCGAGACTTGAAGAAAACGGCCCCTTGA 1137



RESULT 4
US-08-232-463-14/C
; Sequence 14, Application US/08232463
; Patent No. 5670367
; GENERAL INFORMATION:
; APPLICANT: DORNER, F.
; APPLICANT: SCHEIFLINGER, F.
; APPLICANT: FALKNER, F. G.
; TITLE OF INVENTION: RECOMBINANT FOWLPOX VIRUS
; NUMBER OF SEQUENCES: 52
; CORRESPONDENCE ADDRESS:
; ADDRESSEE: Foley & Lardner
; STREET: 1800 Diagonal Road, Suite 500
; CITY: Alexandria
; STATE: VA
; COUNTRY: USA
; ZIP: 22313-0299
; COMPUTER READABLE FORM:
; MEDIUM TYPE: Floppy disk
; COMPUTER: IBM PC compatible
; OPERATING SYSTEM: PC-DOS/MS-DOS
; SOFTWARE: PatentIn Release #1.0, Version #1.25
; CURRENT APPLICATION DATA:
; APPLICATION NUMBER: US/08/232,463
; FILING DATE:
; CLASSIFICATION: 435
; PRIOR APPLICATION DATA:
; APPLICATION NUMBER: US/07/935,313
; FILING DATE:
; APPLICATION NUMBER: EP 91 114 300.6
; FILING DATE: 26-AUG-1991
; ATTORNEY/AGENT INFORMATION:
; NAME: BENT, Stephen A.
; REGISTRATION NUMBER: 29,768
; REFERENCE/DOCKET NUMBER: 30472/114 IMMU
; TELECOMMUNICATION INFORMATION:
; TELEPHONE: (703)836-9300
; TELEFAX: (703)683-4109
; TELEX: 899149
; INFORMATION FOR SEQ ID NO: 14:
; SEQUENCE CHARACTERISTICS:
; LENGTH: 7218 base pairs
; TYPE: nucleic acid
; STRANDEDNESS: single
; TOPOLOGY: linear
; IMMEDIATE SOURCE:
; CLONE: pTZgpt-Fls
US-08-232-463-14



Query Match 5.1%; Score 51.6; DB 1; Length 7218;

Best Local Similarity 3.6%; Pred. No. 4.2e-05;

Matches 12; Conservative 196; Mismatches 130; Indels 0; Gaps 0;

Qy          1 ATGAAAAAAATGCCTTTGTTTAGTAAATCACACAAAAATCCAGCAGAAATTGTGAAAATC 60
              | || | | | |  |||| |  : :::  : : :::::   :: :::::  : :::::
Db       1456 AAGAGATAGAAGAATTTGGTACRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR 1397

Qy         61 CTGAAAGACAATTTGGCCATTTTGGAAAAGCAAGACAAAAAGACAGAC             120
Db       1396 RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR 1337

Qy        121 GAAGTGTCTAAATCACTGCAAGCAATGAAAGAAATTCTGTGTGGTACAAACGAGAAAGAA 180
Db       1336 RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR 1277

Qy        181 CCCCCAACAGAAGCAGTGGCTCAGCTAGCACAAGAACTCTACAGCAGTGGCCTGCTAGTG 240
Db       1276 RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR 1217

Qy        241 ACACTGATAGCTGACCTGCAGCTGATAGACTTTGAGGGAAAAAAAGATGTGACCCAGATA 300
Db       1216 RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR 1157

Qy        301 TTTAACAACATCTTGAGAAGACAGATAGGCACTCGGAG 338
Db       1156 RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR 1119

RESULT 5 

US-09-214-307A-9 

Sequence 9, Application US/09214307A 
Patent No. 6544516 
GENERAL INFORMATION: 
APPLICANT: NEUTEC PHARMA PLC 

TITLE OF INVENTION: TREATMENT AND DIAGNOSIS OF INFECTIONS OF GRAM POSITIVE 
TITLE OF INVENTION: COCCI 
FILE REFERENCE: PM 259204 

CURRENT APPLICATION NUMBER: US/09/214,307A
CURRENT FILING DATE: 1999-01-04
PRIOR APPLICATION NUMBER: PCT/GB97/01830
PRIOR FILING DATE: 1997-07-07
PRIOR APPLICATION NUMBER: GB9614274.0
PRIOR FILING DATE: 1996-07-06
NUMBER OF SEQ ID NOS: 15
SOFTWARE: PatentIn Ver. 2.1
SEQ ID NO 9
LENGTH: 1457
TYPE: DNA

ORGANISM: Staphylococcus aureus 
US-09-214-307A-9 

Query Match 3.7%; Score 37.8; DB 4; Length 1457; 

Best Local Similarity 47.0%; Pred. No. 0.23;

Matches 150; Conservative 0; Mismatches 167; Indels 2; Gaps 1; 
Qy 430 ATTATGCTGAGAGAATGTATTCGACATGAACCACTTGCCAAAATCATCCTCTTTTCTAAT 489

I MM I I I I Mill I I III MM II Mill MM 

Db 386 AATATGAGAACTGTAGTTGATCGACCTAGAACACAATATAAAAAAGTCGTCTTTAATAAT 445 



Qy 490 CAATTCAGAGATTTCTTTAAGTACGTGGAGTTGTCAACATTTGATATTGCTTCAGATGCC 549

III I II I I I I I I I II I I III I I III 

Db 446 TTATTTTATCAATTTAGTAAGGATGCCAACTTTGAACCTATTGCTTGTAGACCCTATCGT 505

Qy 550 TTTGCTACTTTCAAGGATTTACTAACCAGACATAAAGTGTTGGTAGCAGACTTCTTAGAA 609

III llllll 1 1 1 1 I I I I Mil 

Db 506 CCTCAAACAAAAGGGTCTGTTGAATCATTAGCTAAATTTGTTGAACAGCGTTTAAGACCA 565 

Qy 610 CAAAATTACGACACTATTTTTGAAGACTATGAGAAATTGCTTCAGTCTGAGAATTATGTT 669 

I I I I I II I I II I II I I I I I III II llllll 

Db 566 TACGATTATGAATTTTATGATGCTG--TAGAACTTATTGGGCTAGTAAACGATTTATGTC 623

Qy 670 ACTAAGAGACAGTCTTTAAAGCTGCTAGGGGAGCTGATCCTGGACCGTCACAACTTTGCC 729

II II II Ml II I I llllll MM I III I 

Db 624 ACGAATTGAATCACTTAGAAATTTCACAAGCAACAGAACAACGACCTATCGACGTTTTCA 683 

Qy 730 ATCATGACAAAGTATATCA 748 

II MM II I I II 

Db 684 ATTATGAAGAAAAAGAACA 702 
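
The summary figures printed for each result appear to be tied together by simple arithmetic, and the Python sketch below cross-checks them. It is a sketch under assumptions, not a description of how the search program computes anything: the function name `summary_figures`, the match score of 1.0, the mismatch score of -0.6, and the gap cost of Gapop + Gapext x length are all inferred from the numbers in this report. Query Match is taken as Score divided by Perfect score (1014 here), and Best Local Similarity as Matches divided by the number of alignment columns. Results with Conservative positions (ambiguity codes such as R) are not covered.

```python
def summary_figures(matches, mismatches, gap_lengths=(), perfect_score=1014,
                    match=1.0, mismatch=-0.6, gap_open=10.0, gap_ext=1.0):
    """Recompute Score, Query Match and Best Local Similarity for one result.

    gap_lengths lists the length of each gap, so Indels = sum(gap_lengths)
    and Gaps = len(gap_lengths). All scoring values are assumptions.
    """
    gap_cost = sum(gap_open + gap_ext * k for k in gap_lengths)
    score = match * matches + mismatch * mismatches - gap_cost
    columns = matches + mismatches + sum(gap_lengths)
    return {
        "score": round(score, 1),
        "query_match_pct": round(100.0 * score / perfect_score, 1),
        "best_local_similarity_pct": round(100.0 * matches / columns, 1),
    }


# RESULT 5: Matches 150; Mismatches 167; Indels 2; Gaps 1
print(summary_figures(150, 167, gap_lengths=(2,)))
# {'score': 37.8, 'query_match_pct': 3.7, 'best_local_similarity_pct': 47.0}
```

The same call with `summary_figures(1012, 2)` gives 1010.8, 99.7 and 99.8, which agrees with the figures listed for RESULT 2 and RESULT 3.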



RESULT 6 

US-09-620-312D-390/C

Sequence 390, Application US/09620312D
Patent No. 6569662
GENERAL INFORMATION:
APPLICANT: Tang, Y. Tom
APPLICANT: Liu, Chenghua
APPLICANT: Asundi, Vinod
APPLICANT: Zhang, Jie
APPLICANT: Ren, Feiyan
APPLICANT: Chen, Rui-hong
APPLICANT: Zhao, Qing A.
APPLICANT: Wehrman, Tom
APPLICANT: Xue, Aidong J.
APPLICANT: Yang, Yonghong
APPLICANT: Wang, Jian-Rui
APPLICANT: Zhou, Ping
APPLICANT: Ma, Yunqing
APPLICANT: Wang, Dunrui
APPLICANT: Wang, Zhiwei
APPLICANT: John Tillinghast
APPLICANT: Drmanac, Radoje T.
TITLE OF INVENTION: Novel Nucleic Acids and
TITLE OF INVENTION: Polypeptides
FILE REFERENCE: 784CIP2B
CURRENT APPLICATION NUMBER: US/09/620,312D
CURRENT FILING DATE: 2000-07-19
PRIOR APPLICATION NUMBER: 09/552,317
PRIOR FILING DATE: 2000-04-25
PRIOR APPLICATION NUMBER: 09/488,725
PRIOR FILING DATE: 2000-01-21
NUMBER OF SEQ ID NOS: 1105
SOFTWARE: pt_FL_genes Version 1.0
SEQ ID NO 390
LENGTH: 4103
TYPE: DNA
ORGANISM: Homo sapiens
FEATURE:
NAME/KEY: CDS
LOCATION: (104)..(3493)
US-09-620-312D-390

Query Match 3.7%; Score 37.4; DB 4; Length 4103; 

Best Local Similarity 60.2%; Pred. No. 0.53; 

Matches 62; Conservative 0; Mismatches 41; Indels 0; Gaps 0; 
Qy 6 AAAAATGCCTTTGTTTAGTAAATCACACAAAAATCCAGCAGAAATTGTGAAAATCCTGAA 65

I Mill 1 1 1 1 I I I II M M II II Mill 

Db 4091 ACAAATGAGAAAGTTTCATTTACCTCAAAAAAATCC^GGCTATAC 4032

Qy 66 AGACAATTTGGCCATTTTGGAAAAGCAAGACAAAAAGACAGAC 108

II II I II MM MM Ml I MUM I II 

Db 4031 AGCCACATAGGAAATTTCCGAAACACAAAAGAAAAAGTCTCAC 3989
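
In results whose identifier ends in /C, such as this one, the Db coordinates run downward (4091 to 3989 here). Assuming that the /C suffix marks a hit on the complementary strand of the database sequence, which this report does not state explicitly, the Db line would be the reverse complement of the stored sequence. The Python sketch below shows that operation; `reverse_complement` is a hypothetical helper name, and ambiguity codes such as R are passed through unchanged for simplicity.

```python
# Translation table for the four unambiguous bases, upper and lower case.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a nucleotide string.

    Characters other than A, C, G, T (for example ambiguity codes)
    are left unchanged; only their order is reversed.
    """
    return seq.translate(_COMPLEMENT)[::-1]


print(reverse_complement("ATGC"))  # GCAT
```

Applying the function twice returns the original string, which is a quick sanity check when mapping Db coordinates back to the stored strand.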



RESULT 7 
US-08-726-214-5 

; Sequence 5, Application US/08726214
; Patent No. 6107076
; GENERAL INFORMATION:
; APPLICANT: Tang, Wei-Jen
; APPLICANT: Gilman, Alfred G.
; TITLE OF INVENTION: SOLUBLE MAMMALIAN ADENYLYL CYCLASE
; TITLE OF INVENTION: AND USES THEREFOR
; NUMBER OF SEQUENCES: 31
; CORRESPONDENCE ADDRESS:
; ADDRESSEE: Arnold, White & Durkee
; STREET: P.O. Box 4433
; CITY: Houston
; STATE: Texas
; COUNTRY: United States of America
; ZIP: 77210
; COMPUTER READABLE FORM:
; MEDIUM TYPE: Floppy disk
; COMPUTER: IBM PC compatible
; OPERATING SYSTEM: PC-DOS/MS-DOS
; SOFTWARE: PatentIn Release #1.0, Version #1.30
; CURRENT APPLICATION DATA:
; APPLICATION NUMBER: US/08/726,214
; FILING DATE: Concurrently Herewith
; CLASSIFICATION: 435
; PRIOR APPLICATION DATA:
; APPLICATION NUMBER: US 60/005,498
; FILING DATE: 04-OCT-1995
; ATTORNEY/AGENT INFORMATION:
; NAME: Highlander, Steven L.
; REGISTRATION NUMBER: 37,642
; REFERENCE/DOCKET NUMBER: UTSD:450
; TELECOMMUNICATION INFORMATION:
; TELEPHONE: (512) 418-3000
; TELEFAX: (512) 474-7577
; INFORMATION FOR SEQ ID NO: 5:
; SEQUENCE CHARACTERISTICS:
; LENGTH: 4533 base pairs
; TYPE: nucleic acid
; STRANDEDNESS: single
; TOPOLOGY: linear
US-08-726-214-5

Query Match 3.6%; Score 36.4; DB 3; Length 4533;

Best Local Similarity 56.8%; Pred. No. 1.1; 

Matches 67; Conservative 0; Mismatches 51; Indels 0; Gaps 0; 
Qy 718 CACAACTTTGCCATCATGACAAAGTATATCAGCAAGCCGGAGAACCTGAAACTCATGATG 777

I II I lllllll! I I Mill I II III 1 1 1 1 lllllll II 

Db 2644 CTCATCGCCAC(^TCATGCTGGTGCAGGTCAGCCACATGGTGAAGCTGA(^CTaVTGCTG 2703 

Qy 778 AACCTCCTTCGGGATAAAAGTCCCAACATCCAGTTTGAAGCCTTTCATGTTTTTAAGG 835

III II I I III II II II II I II Ml III I I 

Db 2704 CTCGTCACAGGCGCCGTGACTGCCATCAACCTGTATGCCTGGTGTCCTGTCTTTGATG 2761



RESULT 8 

US-09-513-057C-20/C 

Sequence 20, Application US/09513057C 
Patent No. 6433251 
GENERAL INFORMATION: 
APPLICANT: Wagner, et al.
TITLE OF INVENTION: GENES REGULATING CIRCADIAN CLOCK FUNCTION AND
TITLE OF INVENTION: PHOTOPERIODISM
FILE REFERENCE: 1505-54357
CURRENT APPLICATION NUMBER: US/09/513,057C
CURRENT FILING DATE: 2000-02-24
NUMBER OF SEQ ID NOS: 35
SOFTWARE: PatentIn version 3.1
SEQ ID NO 20
LENGTH: 577
TYPE: DNA

ORGANISM: Lycopersicon esculentum 
US-09-513-057C-20 

Query Match 3.5%; Score 35.6; DB 4; Length 577; 

Best Local Similarity 51.2%; Pred. No. 0.64; 

Matches 83; Conservative 0; Mismatches 79; Indels 0; Gaps 0; 
Qy 457 GAACCACTTGCCAAAATCATCCTCTTTTCTAATCAATTCAGAGATTTCTTTAAGTACGTG 516

II III I II III II Mill I I III MINIM! I I 

Db 223 GACCCAAATACCCAAAACACAATCTTTACATA^ 164

Qy 517 GAGTTGTCAACATTTGATATTGCTTCAGATGCCTTTGCTACTTTCAAGGATTTACTAACC 576 

II II II I III I I II II I II III I II II 

Db 163 AAGCAAAAAAGATGTATAATTTCACAAAATTACTATTATATTTTTCTGTGATCATGTAAC 104 

Qy 577 AGACATAAAGTGTTGGTAGCAGACTTCTTAGAACAAAATTAC 618 

II I I lllllll MM I Mill 

Db 103 AGGCCTTGTTGGTAAGCACAATAATATGAAGAAAGAGATTAC 62



RESULT 9 



US-09-276-531-42/C 

Sequence 42, Application US/09276531 
Patent No. 6183968 
GENERAL INFORMATION: 

APPLICANT: Bandman, Olga
APPLICANT: Lal, Preeti
APPLICANT: Hillman, Jennifer L.
APPLICANT: Yue, Henry
APPLICANT: Reddy, Roopa
APPLICANT: Guegler, Karl J.
APPLICANT: Baughn, Mariah R.
TITLE OF INVENTION: COMPOSITION FOR THE DETECTION OF GENES ENCODING
TITLE OF INVENTION: RECEPTORS AND PROTEINS ASSOCIATED WITH CELL
TITLE OF INVENTION: PROLIFERATION
NUMBER OF SEQUENCES: 134
CORRESPONDENCE ADDRESS:
ADDRESSEE: INCYTE PHARMACEUTICALS, INC.
STREET: 3174 PORTER DRIVE
CITY: PALO ALTO
STATE: CALIFORNIA
COUNTRY: USA
ZIP: 94304
COMPUTER READABLE FORM:
MEDIUM TYPE: Floppy disk
COMPUTER: IBM PC compatible
OPERATING SYSTEM: PC-DOS/MS-DOS
SOFTWARE: Word Perfect 6.1 for Windows/MS-DOS 6.2
CURRENT APPLICATION DATA:
APPLICATION NUMBER: US/09/276,531
FILING DATE: Herewith
CLASSIFICATION:
PRIOR APPLICATION DATA:
APPLICATION NUMBER: 60/079,677
FILING DATE: March 27, 1998
CLASSIFICATION:
ATTORNEY/AGENT INFORMATION:
NAME: Lynn E. Murry, Ph.D.
REGISTRATION NUMBER: 42,918
REFERENCE/DOCKET NUMBER: PA-0008 US
TELECOMMUNICATION INFORMATION:
TELEPHONE: (650) 855-0555
TELEFAX: (650) 845-4166
INFORMATION FOR SEQ ID NO: 42:
SEQUENCE CHARACTERISTICS:
LENGTH: 3707 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
IMMEDIATE SOURCE:
LIBRARY: CERVNOT01
CLONE: 936117
US-09-276-531-42



Query Match 3.5%; Score 35.6; DB 3; Length 3707;

Best Local Similarity 51.9%; Pred. No. 1.7;

Matches 80; Conservative 0; Mismatches 74; Indels 0; Gaps 0;



Qy 445 TGTATTCGACATGAACCACTTGCCAAAATCATCCTCTTTTCTAATCAATTCAGAGATTTC 504

ii 111 i 111 ii i inn iiiiiii ii ii i i i ii 

Db 3154 TGCTTTCAAAATGTGGAACAAACTAAAATATAAGGCTTTTCTGATAAACTATAAAAATTT 3095

Qy 505 TTTAAGTACGTGGAGTTGTCAACATTTGATATTGCTTCAGATGCCTTTGCTACTTTCAAG 564

I II II I I I I I I I I I I I I I I I I III I I I I 

Db 3094 AATCAGCACTTGGATCTAATGACATATCTTTATAATACTTCCTCTGCAGATACATTCACT 3035

Qy 565 GATTTACTAACCAGACATAAAGTGTTGGTAGCAG 598

III II Mill I III II III 

Db 3034 TAGTTCAAACCTTAACATACAAAGTTAGTCTCAG 3001



RESULT 10 

US-09-620-312D-393/C 

Sequence 393, Application US/09620312D 
Patent No. 6569662 
GENERAL INFORMATION: 
APPLICANT: Tang, Y. Tom 
APPLICANT: Liu, Chenghua 
APPLICANT: Asundi, Vinod
APPLICANT: Zhang, Jie
APPLICANT: Ren, Feiyan
APPLICANT: Chen, Rui-hong
APPLICANT: Zhao, Qing A.
APPLICANT: Wehrman, Tom
APPLICANT: Xue, Aidong J.
APPLICANT: Yang, Yonghong
APPLICANT: Wang, Jian-Rui
APPLICANT: Zhou, Ping
APPLICANT: Ma, Yunqing
APPLICANT: Wang, Dunrui
APPLICANT: Wang, Zhiwei
APPLICANT: John Tillinghast
APPLICANT: Drmanac, Radoje T.
TITLE OF INVENTION: Novel Nucleic Acids and
TITLE OF INVENTION: Polypeptides
FILE REFERENCE: 784CIP2B
CURRENT APPLICATION NUMBER: US/09/620,312D
CURRENT FILING DATE: 2000-07-19
PRIOR APPLICATION NUMBER: 09/552,317
PRIOR FILING DATE: 2000-04-25
PRIOR APPLICATION NUMBER: 09/488,725
PRIOR FILING DATE: 2000-01-21
NUMBER OF SEQ ID NOS: 1105
SOFTWARE: pt_FL_genes Version 1.0
SEQ ID NO 393
LENGTH: 5714
TYPE: DNA
ORGANISM: Homo sapiens
FEATURE:
NAME/KEY: CDS
LOCATION: (272)..(4312)
US-09-620-312D-393 



Query Match 3.5%; Score 35.6; DB 4; Length 5714;
Best Local Similarity 51.9%; Pred. No. 2.2;
Matches 80; Conservative 0; Mismatches 74; Indels 0; Gaps 0;

Qy 445 TGTATTCGACATGAACCACTTGCCAAAATCATCCTCTTTTCTAATCAATTCAGAGATTTC 504

II Ml I III I I I Mill Illllll II II I I I II 

Db 5233 TGCTTTCAAAATGTGGAACAAACTAAAATATAAGGCTTTTCTGATAAACTATAAAAATTT 5174 



Qy 505 TTTAAGTACGTGGAGTTGTCAACATTTGATATTGCTTCAGATGCCTTTGCTACTTTCAAG 564 

I M II MM I MM I I I I I I I III MM 

Db 5173 AATCAGCACTTGGATCTAATGACATATCTTTGTAATACTTCCTCTGCAGATACATTCACT 5114 

Qy 565 GATTTACTAACCAGACATAAAGTGTTGGTAGCAG 598 

I II II Mill I III II Ml 

Db 5113 TAGTTCAAACCTTAACATACAAAGTTAGTCTCAG 5080



RESULT 11 
US-09-004-838-124 

Sequence 124, Application US/09004838
Patent No. 6350933
GENERAL INFORMATION:
APPLICANT: Michelmore, Richard W.
APPLICANT: Shen, Kathy
APPLICANT: Meyers, Blake
TITLE OF INVENTION: Procedures and Materials for
TITLE OF INVENTION: Conferring Pest Resistance in Plants
NUMBER OF SEQUENCES: 140
CORRESPONDENCE ADDRESS:
ADDRESSEE: Townsend and Townsend and Crew LLP
STREET: Two Embarcadero Center, Eighth Floor
CITY: San Francisco
STATE: California
COUNTRY: USA
ZIP: 94111-3834
COMPUTER READABLE FORM:
MEDIUM TYPE: Floppy disk
COMPUTER: IBM PC compatible
OPERATING SYSTEM: PC-DOS/MS-DOS
SOFTWARE: PatentIn Release #1.0, Version #1.30
CURRENT APPLICATION DATA:
APPLICATION NUMBER: US/09/004,838
FILING DATE: 09-JAN-1998
CLASSIFICATION: 800
PRIOR APPLICATION DATA:
APPLICATION NUMBER: US 08/781,734
FILING DATE: 10-JAN-1997
ATTORNEY/AGENT INFORMATION:
NAME: Einhorn, Gregory P.
REGISTRATION NUMBER: 38,440
REFERENCE/DOCKET NUMBER: 02307O-078810US
TELECOMMUNICATION INFORMATION:
TELEPHONE: (415) 576-0200
TELEFAX: (415) 576-0300
INFORMATION FOR SEQ ID NO: 124:
SEQUENCE CHARACTERISTICS:
LENGTH: 12793 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
MOLECULE TYPE: DNA
FEATURE:
NAME/KEY:
LOCATION: 1..12793
OTHER INFORMATION: /note= "RG2S"
US-09-004-838-124 

Query Match 3.5%; Score 35.2; DB 4; Length 12793;

Best Local Similarity 47.6%; Pred. No. 4.4; 

Matches 101; Conservative 10; Mismatches 98; Indels 3; Gaps 1; 

Qy 438 GAGAGAATGTATTCGACATGAACCACTTGCCAAAATCATCCTCTTTTCTAATCAATTCAG 497

I II hi : = || |: ::|::::| | Mill III III II 

Db 5998 GAGARAGWAWGRRRGAKAKARMCSMSYTTGGGATGTGATACTTCTTTTAGGAAAATGGAG 6057

Qy 498 AGATTTCTTTAAGTACGTGGAGTTGTCA---ACATTTGATATTGCTTCAGATGCCTTTGC 554

II Mill I II MM I II I II 1 1 III I III 

Db 6058 TTATATCTTTGATATTGTATTTTTTTAATGTAATTTATATATTTAATCATTTTAGTTTAT 6117

Qy 555 TACTTTCAAGGATTTACTAACCAGACATAAAGTGTTGGTAGCAGACTTCTTAGAACAAAA 614

I III I MM I II I I II 1 1 II II III I MM 

Db 6118 AAGTTTTATTTATTTTGATATGAAAAAAAAAGTCTTTTATACATTGGATTTAACATAAAA 6177 

Qy 615 TTACGACACTATTTTTGAAGACTATGAGAAAT 646

Db 6178 ATCCAACAATATTAATCAAAAAGACCAMACAT 6209



RESULT 12 
US-08-961-083-89 

Sequence 89, Application US/08961083 
Patent No. 6159469 
GENERAL INFORMATION: 

APPLICANT: Choi et al.
TITLE OF INVENTION: Streptococcus pneumoniae Antigens and Vaccines
NUMBER OF SEQUENCES: 452
CORRESPONDENCE ADDRESS:
ADDRESSEE: Human Genome Sciences, Inc.
STREET: 9410 Key West Avenue
CITY: Rockville
STATE: Maryland
COUNTRY: USA
ZIP: 20850
COMPUTER READABLE FORM:
MEDIUM TYPE: Diskette, 3.50 inch, 1.4Mb storage
COMPUTER: HP Vectra 486/33
OPERATING SYSTEM: MSDOS version 6.2
SOFTWARE: ASCII Text
CURRENT APPLICATION DATA:
APPLICATION NUMBER: US/08/961,083
FILING DATE:
CLASSIFICATION: 435
PRIOR APPLICATION DATA:
APPLICATION NUMBER:
FILING DATE:
ATTORNEY/AGENT INFORMATION:
NAME: Brookes, A. Anders
REGISTRATION NUMBER: 36,373
REFERENCE/DOCKET NUMBER: PB340P2
TELECOMMUNICATION INFORMATION:
TELEPHONE: (301) 309-8504
TELEFAX: (301) 309-8512
INFORMATION FOR SEQ ID NO: 89:
SEQUENCE CHARACTERISTICS:
LENGTH: 775 base pairs
TYPE: nucleic acid
STRANDEDNESS: double
TOPOLOGY: linear
US-08-961-083-89 

Query Match 3.5%; Score 35; DB 3; Length 775; 

Best Local Similarity 46.9%; Pred. No. 1.1; 

Matches 143; Conservative 0; Mismatches 160; Indels 2; Gaps 1; 

Qy 22 AGTAAATCACACAAAAATCCAGCAGAAATTGTGAAAATCCTGAAAGACAATTTGGCCATT 81

III II II I II Mill II III I III I I I 

Db 263 AGTCAACCATCAGACAAACCAGCTGAGGAATCAAAAGTTGA 322 

Qy 82 TTGGAAAAGCAAGACAAAAAGACAGACAAGGCTTCAGAAGAAGTGTCTAAATCACTGCAA 141

I II Mill Mill II I Mill I I II I I II 

Db 323 GCGCCAAGAGAAGACGAAAAGGCACCAGTCGAGCCAGAAAAGCAACCAGAAGCTCCTGAA 382

Qy 142 GCAATGAAAGAAATTCTGTGTGGTACAAACGAGAAAGAACCCCCAACAGAAGCAGTGGCT 201

I I Ml I I I Mil MM I I MM I MM 

Db 383 GAAGAGAAGGCTGTAGAGGAAACACCGAAACAAGAAGAGTCAACTCCAGATACCAAGGCT 442 

Qy 202 CAGCTAGCACAAGAACTCTACAGCAGTGGCCTGCTAGTGACACTGATAGCTGACCTGCAG 261

I I I MM II I M I III II I I II I III II 

Db 443 GAAGAAACTGTAGAA--CCAAAAGAGGAGACTGTTAATCAATCTATTGAACAACCAAAAG 500



Qy 262 CTGATAGACTTTGAGGGAAAAAAAGATGTGACCCAGATATTTAACAACATCTTGAGAAGA 321

III I I I I Mill I II III III I I MM I 

Db 501 TTGAAACGCCTGCTGTAGAAAAACAAACAGAACCAACA^ 56 0 

Qy 322 CAGAT 326 

III I 

Db 561 CAGGT 565 



RESULT 13 
US-09-536-784-89 

Sequence 89, Application US/09536784
Patent No. 6573082
GENERAL INFORMATION:
APPLICANT: Choi et al.
TITLE OF INVENTION: Streptococcus pneumoniae Antigens and Vaccines
NUMBER OF SEQUENCES: 452
CORRESPONDENCE ADDRESS:
ADDRESSEE: Human Genome Sciences, Inc.
STREET: 9410 Key West Avenue
CITY: Rockville
STATE: Maryland
COUNTRY: USA
ZIP: 20850
COMPUTER READABLE FORM:
MEDIUM TYPE: Diskette, 3.50 inch, 1.4Mb storage
COMPUTER: HP Vectra 486/33
OPERATING SYSTEM: MSDOS version 6.2
SOFTWARE: ASCII Text
CURRENT APPLICATION DATA:
APPLICATION NUMBER: US/09/536,784
FILING DATE: 30-Oct-1997
CLASSIFICATION: <Unknown> 
PRIOR APPLICATION DATA:
APPLICATION NUMBER: 08/961,083
FILING DATE: OCT-30-1997
ATTORNEY/AGENT INFORMATION:
NAME: Michelle S. Marks
REGISTRATION NUMBER: 41,971
REFERENCE/DOCKET NUMBER: PB340P3
TELECOMMUNICATION INFORMATION:
TELEPHONE: (301) 309-8504
TELEFAX: (301) 309-8512
INFORMATION FOR SEQ ID NO: 89:
SEQUENCE CHARACTERISTICS:
LENGTH: 775 base pairs
TYPE: nucleic acid
STRANDEDNESS: double
TOPOLOGY: linear
SEQUENCE DESCRIPTION: SEQ ID NO: 89:
US-09-536-784-89 



Query Match 3.5%; Score 35; DB 4; Length 775;
Best Local Similarity 46.9%; Pred. No. 1.1;
Matches 143; Conservative 0; Mismatches 160; Indels 2; Gaps 1;



Qy 22 AGTAAATCACACAAAAATCCAGCAGAAATTGTGAAAATCCTGAAAGACAATTTGGCCATT 81
Ml M II I II Mill II III I III I I I
Db 263 AGTCAACCATCAGACAAACCAGCTGAGGAATCAAA 322

Qy 82 TTGGAAAAGCAAGACAAAAAGACAGACAAGGCTTCAGAAGAAGTGTCTAAATCACTGCAA 141
I II Mill Mill II I Mill I I II I I II
Db 323 GCGCCAAGAGAAGACGAAAAGGCACCAGTCGAGCCAGAAAAGCAACCAGAAGCTCCTGAA 382

Qy 142 GCAATGAAAGAAATTCTGTGTGGTACAAACGAGAAAGAACCCCCAACAGAAGCAGTGGCT 201
I I Ml I I I I II I MM I I MM I MM
Db 383 GAAGAGAAGGCTGTAGAGGAAACACCGAAACAAGAAGAGTCAACTCCAGATACCAAGGCT 442

Qy 202 CAGCTAGCACAAGAACTCTACAGCAGTGGCCTGCTAGTGACACTGATAGCTGACCTGCAG 261
I I I 1 1 1 1 I I I II I Mill I I II I Ml II
Db 443 GAAGAAACTGTAGAA--CCAAAAGAGGAGACTGTTAATCAATCTATTGAACAACCAAAAG 500

Qy 262 CTGATAGACTTTGAGGGAAAAAAAGATGTGACCCAGATATTTAACAACATCTTGAGAAGA 321
III I I I I Mill I II III III I I MM I
Db 501 TTGAAACGCCTGCTGTAGAAAAACAAACAGAACCAA 560

Qy 322 CAGAT 326
III I
Db 561 CAGGT 565



RESULT 14 
US-08-961-083-217 

Sequence 217, Application US/08961083
Patent No. 6159469
GENERAL INFORMATION:
APPLICANT: Choi et al.
TITLE OF INVENTION: Streptococcus pneumoniae Antigens and Vaccines
NUMBER OF SEQUENCES: 452
CORRESPONDENCE ADDRESS:
ADDRESSEE: Human Genome Sciences, Inc.
STREET: 9410 Key West Avenue
CITY: Rockville
STATE: Maryland
COUNTRY: USA
ZIP: 20850
COMPUTER READABLE FORM:
MEDIUM TYPE: Diskette, 3.50 inch, 1.4Mb storage
COMPUTER: HP Vectra 486/33
OPERATING SYSTEM: MSDOS version 6.2
SOFTWARE: ASCII Text
CURRENT APPLICATION DATA:
APPLICATION NUMBER: US/08/961,083
FILING DATE:
CLASSIFICATION: 435
PRIOR APPLICATION DATA:
APPLICATION NUMBER:
FILING DATE:
ATTORNEY/AGENT INFORMATION:
NAME: Brookes, A. Anders
REGISTRATION NUMBER: 36,373
REFERENCE/DOCKET NUMBER: PB340P2
TELECOMMUNICATION INFORMATION:
TELEPHONE: (301) 309-8504
TELEFAX: (301) 309-8512
INFORMATION FOR SEQ ID NO: 217:
SEQUENCE CHARACTERISTICS:
LENGTH: 1696 base pairs
TYPE: nucleic acid
STRANDEDNESS: double
TOPOLOGY: linear
US-08-961-083-217 

Query Match 3.5%; Score 35; DB 3; Length 1696; 

Best Local Similarity 46.9%; Pred. No. 1.7; 

Matches 143; Conservative 0; Mismatches 160; Indels 2; Gaps 1; 

Qy 22 AGTAAATCACACAAAAATCCAGCAGAAATTGTGAAAATCCTGAAAGACAATTTGGCCATT 81

III II II I II Mill II III I Ml I I I 

Db 275 AGTCAACCATCAGACAAAC(^GCTGAGGAATCAAAAGTTGAACAAGCAGGT^ 334 

Qy 82 TTGGAAAAGCAAGACAAAAAGACAGACAAGGCTTCAGAAGAAGTGTCTAAATCACTGCAA 141

I II MIM 1 1 1 1 1 II I Mill I I II I I II 

Db 335 GCGCCAAGAGAAGACGAAAAGGCACCAGTCGAGCCAGAAAAGCAACCAGAAGCTCCTGAA 394

Qy 142 GCAATGAAAGAAATTCTGTGTGGTACAAACGAGAAAGAACCCCCAACAGAAGCAGTGGCT 201

I I III I I I I II I MM I I MM I MM 



Db 395 GAAGAGAAGGCTGTAGAGGAAACACCGAAACAAGAAGAGTCAACTCCAGATACCAAGGCT 454



Qy 202 CAGCTAGCACAAGAACTCTACAGCAGTGGCCTGCTAGTGACACTGATAGCTGACCTGCAG 261

I I I I II I I I I II I III II II II I Ml I I 

Db 455 GAAGAAACTGTAGAA--CCAAAAGAGGAGACTGTTAATCAATCTATTGAACAACCAAAAG 512

Qy 262 CTGATAGACTTTGAGGGAAAAAAAGATGTGACCCAGATATTTAACAACATCTTGAGAAGA 321

Ml I I I I Mill I II III III I I MM I 

Db 513 TTGAAACGCCTGCTGTAGAAAAACAAACA.GAACCAACAGAGGAACCAAAAGTTGAAC^G 572 

Qy 322 CAGAT 326 

III I 

Db 573 CAGGT 577 



RESULT 15 
US-09-536-784-217 

Sequence 217, Application US/09536784
Patent No. 6573082
GENERAL INFORMATION:
APPLICANT: Choi et al.
TITLE OF INVENTION: Streptococcus pneumoniae Antigens and Vaccines
NUMBER OF SEQUENCES: 452
CORRESPONDENCE ADDRESS:
ADDRESSEE: Human Genome Sciences, Inc.
STREET: 9410 Key West Avenue
CITY: Rockville
STATE: Maryland
COUNTRY: USA
ZIP: 20850
COMPUTER READABLE FORM:
MEDIUM TYPE: Diskette, 3.50 inch, 1.4Mb storage
COMPUTER: HP Vectra 486/33
OPERATING SYSTEM: MSDOS version 6.2
SOFTWARE: ASCII Text
CURRENT APPLICATION DATA:
APPLICATION NUMBER: US/09/536,784
FILING DATE: 30-Oct-1997

CLASSIFICATION: <Unknown> 
PRIOR APPLICATION DATA:
APPLICATION NUMBER: 08/961,083
FILING DATE: OCT-30-1997
ATTORNEY/AGENT INFORMATION:
NAME: Michelle S. Marks
REGISTRATION NUMBER: 41,971
REFERENCE/DOCKET NUMBER: PB340P3
TELECOMMUNICATION INFORMATION:
TELEPHONE: (301) 309-8504
TELEFAX: (301) 309-8512
INFORMATION FOR SEQ ID NO: 217:
SEQUENCE CHARACTERISTICS:
LENGTH: 1696 base pairs
TYPE: nucleic acid
STRANDEDNESS: double
TOPOLOGY: linear
SEQUENCE DESCRIPTION: SEQ ID NO: 217:
US-09-536-784-217 



Query Match 3.5%; Score 35; DB 4; Length 1696; 

Best Local Similarity 46.9%; Pred. No. 1.7; 

Matches 143; Conservative 0; Mismatches 160; Indels 2; Gaps 1; 

Qy 22 AGTAAATCACACAAAAATCCAGCAGAAATTGTGAAAATCCTGAAAGACAATTTGGCCATT 81

III II II I II Mill II III I III I I I 

Db 275 AGTCAACCATCAGACAAACCAGCTGAGGAATCAAAAGTTC 334 

Qy 82 TTGGAAAAGCAAGACAAAAAGACAGACAAGGCTTCAGAAGAAGTGTCTAAATCACTGCAA 141

I II Mill Mill II I Mill I I II I I II 

Db 335 GCGCCAAGAGAAGACGAAAAGGCACCAGTCGAGCC 3 94 

Qy 142 GCAATGAAAGAAATTCTGTGTGGTACAAACGAGAAAGAACCCCCAACAGAAGCAGTGGCT 201

I I III I I I Ml I MM I I MM I MM 

Db 3 95 GAAGAGAAGG CTGTAGAGGAAACAC CGAAACAAGAAGAGT CA^ 4 54 

Qy 202 CAGCTAGCACAAGAACTCTACAGCAGTGGCCTGCTAGTGACACTGATAGCTGACCTGCAG 261

I I I MM I I I II I III II I I II I Ml II 

Db 455 GAAGAAACTGTAGAA--CCAAAAGAGGAGACTGTTAATCAATCTATTGAACAACCAAAAG 512

Qy 262 CTGATAGACTTTGAGGGAAAAAAAGATGTGACCCAGATATTTAACAACATCTTGAGAAGA 321 

MM I I I Mill I II III III I I MM I 

Db 513 TTGAAACGCCTGCTGTAGAAAAACAAACAGAACCAACA^^ 572 

Qy 322 CAGAT 326 

III I 

Db 573 CAGGT 577 



Search completed: January 6, 2004, 03:19:48 
Job time : 96 secs



GenCore version 5.1.6 
Copyright (c) 1993 - 2004 Compugen Ltd. 



OM nucleic - nucleic search, using sw model 



Run on: January 6, 2004, 00:37:47 ; Search time 3965 Seconds
(without alignments)
10462.134 Million cell updates/sec

Title: US-10-088-872-1
Perfect score: 1014
Sequence: 1 atgaaaaaaatgcctttgtt..........tgaagaaaacggccccttga 1014

Scoring table: IDENTITY_NUC
Gapop 10.0 , Gapext 1.0

Searched: 2888711 seqs, 20454813386 residues

Total number of hits satisfying chosen parameters: 5777422

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
Maximum Match 100%
Listing first 45 summaries



Database : GenEmbl:*
1: gb_ba:*
2: gb_htg:*
3: gb_in:*
4: gb_om:*
5: gb_ov:*
6: gb_pat:*
7: gb_ph:*
8: gb_pl:*
9: gb_pr:*
10: gb_ro:*
11: gb_sts:*
12: gb_sy:*
13: gb_un:*
14: gb_vi:*
15: em_ba:*
16: em_fun:*
17: em_hum:*
18: em_in:*
19: em_mu:*
20: em_om:*
21: em_or:*
22: em_ov:*
23: em_pat:*
24: em_ph:*
25: em_pl:*
26: em_ro:*
27: em_sts:*
28: em_un:*
29: em_vi:*
30: em_htg_hum:*
31: em_htg_inv:*
32: em
33: em
34: em
35: em_htg_rod:*
36: em_htg_mam:*
37: em_htg_vrt:*
38: em_sy:*
39: em_htgo_hum:*
40: em_htgo_mus:*
41: em_htgo_other:*



Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
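
The Score and Query Match figures reported for each result can be cross-checked against the match counts with a short calculation. The sketch below is a reconstruction, not the program's documented scoring: it assumes +1 per match, -0.6 per mismatch (a value inferred from the figures in this report, not stated anywhere in it), a gap cost of Gapop + Gapext x gap length, and Query Match = Score / Perfect score. The function names are hypothetical, and conservative (ambiguity-code) matches are not modeled.

```python
# Hypothetical reconstruction of the Score and Query Match columns.
# ASSUMPTIONS (inferred from the figures in this report, not documented in it):
#   +1.0 per match, -0.6 per mismatch,
#   each gap costs GAPOP + GAPEXT * gap_length,
#   Query Match (%) = 100 * Score / Perfect score.
GAPOP = 10.0          # "Gapop 10.0" from the run header
GAPEXT = 1.0          # "Gapext 1.0" from the run header
MISMATCH = -0.6       # assumed mismatch score, not stated in the report
PERFECT_SCORE = 1014  # "Perfect score" from the run header


def sw_score(matches, mismatches, gap_lengths=()):
    """Return the alignment score under the assumed scheme, rounded to 0.1."""
    score = matches * 1.0 + mismatches * MISMATCH
    for length in gap_lengths:
        score -= GAPOP + GAPEXT * length
    return round(score, 1)


def query_match_percent(score, perfect=PERFECT_SCORE):
    """Return the score as a percentage of the perfect score, rounded to 0.1."""
    return round(100.0 * score / perfect, 1)


if __name__ == "__main__":
    # Figures taken from the result blocks of this report.
    print(sw_score(1012, 2), query_match_percent(1010.8))         # 1010.8 99.7
    print(sw_score(1011, 3), query_match_percent(1009.2))         # 1009.2 99.5
    print(sw_score(80, 74), query_match_percent(35.6))            # 35.6 3.5
    print(sw_score(143, 160, gap_lengths=(2,)),
          query_match_percent(35.0))                              # 35.0 3.5
```

Under these assumptions the calculation reproduces the reported values for the hits with 1012, 1011, 80 and 143 matches, which suggests the assumed penalties are at least consistent with the data shown here.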
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ALIGNMENTS 



RESULT 1 
AX105381 

LOCUS       AX105381     1014 bp    DNA     linear   PAT 30-APR-2001
DEFINITION  Sequence 1 from Patent WO0123552.
ACCESSION   AX105381
VERSION     AX105381.1  GI:13921508
KEYWORDS
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1
  AUTHORS   den Daas,I. and Duecker,K.
  TITLE     Human paralogue of a head trauma induced cytoplasmatic calcium
            binding protein
  JOURNAL   Patent: WO 0123552-A 1 05-APR-2001;
            MERCK PATENT GmbH (DE)
FEATURES             Location/Qualifiers
     source          1..1014
                     /organism="Homo sapiens"
                     /mol_type="genomic DNA"
                     /db_xref="taxon:9606"
     CDS             1..1014
                     /note="unnamed protein product"
                     /codon_start=1
                     /protein_id="CAC37735.1"
                     /db_xref="GI:13921509"
                     /translation="MKKMPLFSKSHKNPAEIVKILKDNLAILEKQDKKTDKASEEVSK
                     SLQAMKEILCGTNEKEPPTEAVAQLAQELYSSGLLVTLIADLQLIDFEGKKDVTQIFN
                     NILRRQIGTRSPTVEYISAHPHILFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIIL
                     FSNQFRDFFKYVELSTFDIASDAFATFKDLLTRHKVLVADFLEQNYDTIFEDYEKLLQ
                     SENYVTKRQSLKLLGELILDRHNFAIMTKYISKPENLKLMMNLLRDKSPNIQFEAFHV
                     FKVFVASPHKTQPIVEILLKNQPKLIEFLSSFQKERTDDEQFADEKNYLIKQIRDLKK
                     TAP"
BASE COUNT      340 a    205 c    209 g    260 t
ORIGIN



Query Match 100.0%; Score 1014; DB 6; Length 1014;
Best Local Similarity 100.0%; Pred. No. 6.9e-238;
Matches 1014; Conservative 0; Mismatches 0; Indels 0; Gaps 0;



Qy      1 ATGAAAAAAATGCCTTTGTTTAGTAAATCACACAAAAATCCAGCAGAAATTGTGAAAATC 60
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      1 ATGAAAAAAATGCCTTTGTTTAGTAAATCACACAAAAATCCAGCAGAAATTGTGAAAATC 60

Qy     61 CTGAAAGACAATTTGGCCATTTTGGAAAAGCAAGACAAAAAGACAGACAAGGCTTCAGAA 120
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     61 CTGAAAGACAATTTGGCCATTTTGGAAAAGCAAGACAAAAAGACAGACAAGGCTTCAGAA 120

Qy    121 GAAGTGTCTAAATCACTGCAAGCAATGAAAGAAATTCTGTGTGGTACAAACGAGAAAGAA 180
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    121 GAAGTGTCTAAATCACTGCAAGCAATGAAAGAAATTCTGTGTGGTACAAACGAGAAAGAA 180

Qy    181 CCCCCAACAGAAGCAGTGGCTCAGCTAGCACAAGAACTCTACAGCAGTGGCCTGCTAGTG 240
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    181 CCCCCAACAGAAGCAGTGGCTCAGCTAGCACAAGAACTCTACAGCAGTGGCCTGCTAGTG 240

Qy    241 ACACTGATAGCTGACCTGCAGCTGATAGACTTTGAGGGAAAAAAAGATGTGACCCAGATA 300
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    241 ACACTGATAGCTGACCTGCAGCTGATAGACTTTGAGGGAAAAAAAGATGTGACCCAGATA 300

Qy    301 TTTAACAACATCTTGAGAAGACAGATAGGCACTCGGAGTCCTACTGTGGAGTATATTAGT 360
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    301 TTTAACAACATCTTGAGAAGACAGATAGGCACTCGGAGTCCTACTGTGGAGTATATTAGT 360

Qy    361 GCTCATCCTCATATCCTGTTTATGCTCCTCAAAGGATATGAAGCCCCACAGATTGCCTTA 420
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    361 GCTCATCCTCATATCCTGTTTATGCTCCTCAAAGGATATGAAGCCCCACAGATTGCCTTA 420

Qy    421 CGTTGTGGGATTATGCTGAGAGAATGTATTCGACATGAACCACTTGCCAAAATCATCCTC 480
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    421 CGTTGTGGGATTATGCTGAGAGAATGTATTCGACATGAACCACTTGCCAAAATCATCCTC 480

Qy    481 TTTTCTAATCAATTCAGAGATTTCTTTAAGTACGTGGAGTTGTCAACATTTGATATTGCT 540
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    481 TTTTCTAATCAATTCAGAGATTTCTTTAAGTACGTGGAGTTGTCAACATTTGATATTGCT 540

Qy    541 TCAGATGCCTTTGCTACTTTCAAGGATTTACTAACCAGACATAAAGTGTTGGTAGCAGAC 600
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    541 TCAGATGCCTTTGCTACTTTCAAGGATTTACTAACCAGACATAAAGTGTTGGTAGCAGAC 600

Qy    601 TTCTTAGAACAAAATTACGACACTATTTTTGAAGACTATGAGAAATTGCTTCAGTCTGAG 660
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    601 TTCTTAGAACAAAATTACGACACTATTTTTGAAGACTATGAGAAATTGCTTCAGTCTGAG 660

Qy    661 AATTATGTTACTAAGAGACAGTCTTTAAAGCTGCTAGGGGAGCTGATCCTGGACCGTCAC 720
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    661 AATTATGTTACTAAGAGACAGTCTTTAAAGCTGCTAGGGGAGCTGATCCTGGACCGTCAC 720

Qy    721 AACTTTGCCATCATGACAAAGTATATCAGCAAGCCGGAGAACCTGAAACTCATGATGAAC 780
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    721 AACTTTGCCATCATGACAAAGTATATCAGCAAGCCGGAGAACCTGAAACTCATGATGAAC 780

Qy    781 CTCCTTCGGGATAAAAGTCCCAACATCCAGTTTGAAGCCTTTCATGTTTTTAAGGTGTTT 840
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    781 CTCCTTCGGGATAAAAGTCCCAACATCCAGTTTGAAGCCTTTCATGTTTTTAAGGTGTTT 840

Qy    841 GTGGCCAGTCCTCACAAAACACAGCCTATTGTGGAGATCCTGTTAAAAAATCAGCCCAAA 900
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    841 GTGGCCAGTCCTCACAAAACACAGCCTATTGTGGAGATCCTGTTAAAAAATCAGCCCAAA 900

Qy    901 CTCATTGAGTTTCTGAGCAGCTTCCAAAAAGAAAGGACGGATGATGAGCAGTTCGCTGAC 960
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    901 CTCATTGAGTTTCTGAGCAGCTTCCAAAAAGAAAGGACGGATGATGAGCAGTTCGCTGAC 960

Qy    961 GAGAAGAACTACTTGATTAAACAGATCCGAGACTTGAAGAAAACGGCCCCTTGA 1014
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    961 GAGAAGAACTACTTGATTAAACAGATCCGAGACTTGAAGAAAACGGCCCCTTGA 1014



RESULT 2 
AR097361 
LOCUS       AR097361     1344 bp    DNA     linear   PAT 14-FEB-2001
DEFINITION  Sequence 2 from patent US 6071721.
ACCESSION   AR097361
VERSION     AR097361.1  GI:12806091
KEYWORDS
SOURCE      Unknown.
  ORGANISM  Unknown.
            Unclassified.
REFERENCE   1  (bases 1 to 1344)
  AUTHORS   Tang,Y.Tom., Guegler,K.J., Corley,N.C. and Gorgone,G.A.
  TITLE     Calcium binding protein
  JOURNAL   Patent: US 6071721-A 2 06-JUN-2000;
FEATURES             Location/Qualifiers
     source          1..1344
                     /organism="unknown"
BASE COUNT      450 a    261 c    280 g    353 t
ORIGIN



Query Match 99.7%; Score 1010.8; DB 6; Length 1344;
Best Local Similarity 99.8%; Pred. No. 4.2e-237;
Matches 1012; Conservative 0; Mismatches 2; Indels 0; Gaps 0;



Qy      1 ATGAAAAAAATGCCTTTGTTTAGTAAATCACACAAAAATCCAGCAGAAATTGTGAAAATC 60
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    124 ATGAAAAAAATGCCTTTGTTTAGTAAATCACACAAAAATCCAGCAGAAATTGTGAAAATC 183

Qy     61 CTGAAAGACAATTTGGCCATTTTGGAAAAGCAAGACAAAAAGACAGACAAGGCTTCAGAA 120
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    184 CTGAAAGACAATTTGGCCATTTTGGAAAAGCAAGACAAAAAGACAGACAAGGCTTCAGAA 243

Qy    121 GAAGTGTCTAAATCACTGCAAGCAATGAAAGAAATTCTGTGTGGTACAAACGAGAAAGAA 180
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    244 GAAGTGTCTAAATCACTGCAAGCAATGAAAGAAATTCTGTGTGGTACAAACGAGAAAGAA 303

Qy    181 CCCCCAACAGAAGCAGTGGCTCAGCTAGCACAAGAACTCTACAGCAGTGGCCTGCTAGTG 240
          ||||| |||||||||||||||||||||||||||||||||||||||||||||||||| |||
Db    304 CCCCCGACAGAAGCAGTGGCTCAGCTAGCACAAGAACTCTACAGCAGTGGCCTGCTGGTG 363

Qy    241 ACACTGATAGCTGACCTGCAGCTGATAGACTTTGAGGGAAAAAAAGATGTGACCCAGATA 300
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    364 ACACTGATAGCTGACCTGCAGCTGATAGACTTTGAGGGAAAAAAAGATGTGACCCAGATA 423

Qy    301 TTTAACAACATCTTGAGAAGACAGATAGGCACTCGGAGTCCTACTGTGGAGTATATTAGT 360
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    424 TTTAACAACATCTTGAGAAGACAGATAGGCACTCGGAGTCCTACTGTGGAGTATATTAGT 483

Qy    361 GCTCATCCTCATATCCTGTTTATGCTCCTCAAAGGATATGAAGCCCCACAGATTGCCTTA 420
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    484 GCTCATCCTCATATCCTGTTTATGCTCCTCAAAGGATATGAAGCCCCACAGATTGCCTTA 543

Qy    421 CGTTGTGGGATTATGCTGAGAGAATGTATTCGACATGAACCACTTGCCAAAATCATCCTC 480
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    544 CGTTGTGGGATTATGCTGAGAGAATGTATTCGACATGAACCACTTGCCAAAATCATCCTC 603

Qy    481 TTTTCTAATCAATTCAGAGATTTCTTTAAGTACGTGGAGTTGTCAACATTTGATATTGCT 540
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    604 TTTTCTAATCAATTCAGAGATTTCTTTAAGTACGTGGAGTTGTCAACATTTGATATTGCT 663

Qy    541 TCAGATGCCTTTGCTACTTTCAAGGATTTACTAACCAGACATAAAGTGTTGGTAGCAGAC 600
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    664 TCAGATGCCTTTGCTACTTTCAAGGATTTACTAACCAGACATAAAGTGTTGGTAGCAGAC 723

Qy    601 TTCTTAGAACAAAATTACGACACTATTTTTGAAGACTATGAGAAATTGCTTCAGTCTGAG 660
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    724 TTCTTAGAACAAAATTACGACACTATTTTTGAAGACTATGAGAAATTGCTTCAGTCTGAG 783

Qy    661 AATTATGTTACTAAGAGACAGTCTTTAAAGCTGCTAGGGGAGCTGATCCTGGACCGTCAC 720
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    784 AATTATGTTACTAAGAGACAGTCTTTAAAGCTGCTAGGGGAGCTGATCCTGGACCGTCAC 843

Qy    721 AACTTTGCCATCATGACAAAGTATATCAGCAAGCCGGAGAACCTGAAACTCATGATGAAC 780
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    844 AACTTTGCCATCATGACAAAGTATATCAGCAAGCCGGAGAACCTGAAACTCATGATGAAC 903

Qy    781 CTCCTTCGGGATAAAAGTCCCAACATCCAGTTTGAAGCCTTTCATGTTTTTAAGGTGTTT 840
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    904 CTCCTTCGGGATAAAAGTCCCAACATCCAGTTTGAAGCCTTTCATGTTTTTAAGGTGTTT 963

Qy    841 GTGGCCAGTCCTCACAAAACACAGCCTATTGTGGAGATCCTGTTAAAAAATCAGCCCAAA 900
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    964 GTGGCCAGTCCTCACAAAACACAGCCTATTGTGGAGATCCTGTTAAAAAATCAGCCCAAA 1023

Qy    901 CTCATTGAGTTTCTGAGCAGCTTCCAAAAAGAAAGGACGGATGATGAGCAGTTCGCTGAC 960
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1024 CTCATTGAGTTTCTGAGCAGCTTCCAAAAAGAAAGGACGGATGATGAGCAGTTCGCTGAC 1083

Qy    961 GAGAAGAACTACTTGATTAAACAGATCCGAGACTTGAAGAAAACGGCCCCTTGA 1014
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1084 GAGAAGAACTACTTGATTAAACAGATCCGAGACTTGAAGAAAACGGCCCCTTGA 1137



RESULT 3 
AR203365 
LOCUS       AR203365     1344 bp    DNA     linear   PAT 20-JUN-2002
DEFINITION  Sequence 2 from patent US 6365371.
ACCESSION   AR203365
VERSION     AR203365.1  GI:21499736
KEYWORDS
SOURCE      Unknown.
  ORGANISM  Unknown.
            Unclassified.
REFERENCE   1  (bases 1 to 1344)
  AUTHORS   Tang,Y.Tom., Guegler,K.J., Corley,N.C. and Gorgone,G.A.
  TITLE     Calcium binding protein
  JOURNAL   Patent: US 6365371-A 2 02-APR-2002;
FEATURES             Location/Qualifiers
     source          1..1344
                     /organism="unknown"
BASE COUNT      450 a    261 c    280 g    353 t
ORIGIN



Query Match 99.7%; Score 1010.8; DB 6; Length 1344;
Best Local Similarity 99.8%; Pred. No. 4.2e-237;
Matches 1012; Conservative 0; Mismatches 2; Indels 0; Gaps 0;



Qy      1 ATGAAAAAAATGCCTTTGTTTAGTAAATCACACAAAAATCCAGCAGAAATTGTGAAAATC 60
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    124 ATGAAAAAAATGCCTTTGTTTAGTAAATCACACAAAAATCCAGCAGAAATTGTGAAAATC 183

Qy     61 CTGAAAGACAATTTGGCCATTTTGGAAAAGCAAGACAAAAAGACAGACAAGGCTTCAGAA 120
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    184 CTGAAAGACAATTTGGCCATTTTGGAAAAGCAAGACAAAAAGACAGACAAGGCTTCAGAA 243

Qy    121 GAAGTGTCTAAATCACTGCAAGCAATGAAAGAAATTCTGTGTGGTACAAACGAGAAAGAA 180
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    244 GAAGTGTCTAAATCACTGCAAGCAATGAAAGAAATTCTGTGTGGTACAAACGAGAAAGAA 303

Qy    181 CCCCCAACAGAAGCAGTGGCTCAGCTAGCACAAGAACTCTACAGCAGTGGCCTGCTAGTG 240
          ||||| |||||||||||||||||||||||||||||||||||||||||||||||||| |||
Db    304 CCCCCGACAGAAGCAGTGGCTCAGCTAGCACAAGAACTCTACAGCAGTGGCCTGCTGGTG 363

Qy    241 ACACTGATAGCTGACCTGCAGCTGATAGACTTTGAGGGAAAAAAAGATGTGACCCAGATA 300
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    364 ACACTGATAGCTGACCTGCAGCTGATAGACTTTGAGGGAAAAAAAGATGTGACCCAGATA 423

Qy    301 TTTAACAACATCTTGAGAAGACAGATAGGCACTCGGAGTCCTACTGTGGAGTATATTAGT 360
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    424 TTTAACAACATCTTGAGAAGACAGATAGGCACTCGGAGTCCTACTGTGGAGTATATTAGT 483

Qy    361 GCTCATCCTCATATCCTGTTTATGCTCCTCAAAGGATATGAAGCCCCACAGATTGCCTTA 420
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    484 GCTCATCCTCATATCCTGTTTATGCTCCTCAAAGGATATGAAGCCCCACAGATTGCCTTA 543

Qy    421 CGTTGTGGGATTATGCTGAGAGAATGTATTCGACATGAACCACTTGCCAAAATCATCCTC 480
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    544 CGTTGTGGGATTATGCTGAGAGAATGTATTCGACATGAACCACTTGCCAAAATCATCCTC 603

Qy    481 TTTTCTAATCAATTCAGAGATTTCTTTAAGTACGTGGAGTTGTCAACATTTGATATTGCT 540
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    604 TTTTCTAATCAATTCAGAGATTTCTTTAAGTACGTGGAGTTGTCAACATTTGATATTGCT 663

Qy    541 TCAGATGCCTTTGCTACTTTCAAGGATTTACTAACCAGACATAAAGTGTTGGTAGCAGAC 600
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    664 TCAGATGCCTTTGCTACTTTCAAGGATTTACTAACCAGACATAAAGTGTTGGTAGCAGAC 723

Qy    601 TTCTTAGAACAAAATTACGACACTATTTTTGAAGACTATGAGAAATTGCTTCAGTCTGAG 660
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    724 TTCTTAGAACAAAATTACGACACTATTTTTGAAGACTATGAGAAATTGCTTCAGTCTGAG 783

Qy    661 AATTATGTTACTAAGAGACAGTCTTTAAAGCTGCTAGGGGAGCTGATCCTGGACCGTCAC 720
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    784 AATTATGTTACTAAGAGACAGTCTTTAAAGCTGCTAGGGGAGCTGATCCTGGACCGTCAC 843

Qy    721 AACTTTGCCATCATGACAAAGTATATCAGCAAGCCGGAGAACCTGAAACTCATGATGAAC 780
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    844 AACTTTGCCATCATGACAAAGTATATCAGCAAGCCGGAGAACCTGAAACTCATGATGAAC 903

Qy    781 CTCCTTCGGGATAAAAGTCCCAACATCCAGTTTGAAGCCTTTCATGTTTTTAAGGTGTTT 840
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    904 CTCCTTCGGGATAAAAGTCCCAACATCCAGTTTGAAGCCTTTCATGTTTTTAAGGTGTTT 963

Qy    841 GTGGCCAGTCCTCACAAAACACAGCCTATTGTGGAGATCCTGTTAAAAAATCAGCCCAAA 900
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    964 GTGGCCAGTCCTCACAAAACACAGCCTATTGTGGAGATCCTGTTAAAAAATCAGCCCAAA 1023

Qy    901 CTCATTGAGTTTCTGAGCAGCTTCCAAAAAGAAAGGACGGATGATGAGCAGTTCGCTGAC 960
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1024 CTCATTGAGTTTCTGAGCAGCTTCCAAAAAGAAAGGACGGATGATGAGCAGTTCGCTGAC 1083

Qy    961 GAGAAGAACTACTTGATTAAACAGATCCGAGACTTGAAGAAAACGGCCCCTTGA 1014
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1084 GAGAAGAACTACTTGATTAAACAGATCCGAGACTTGAAGAAAACGGCCCCTTGA 1137



RESULT 4 
BC010993 
LOCUS       BC010993     1491 bp    mRNA    linear   PRI 25-JUL-2001
DEFINITION  Homo sapiens, hypothetical protein FLJ12577, clone MGC:15031
            IMAGE:3956127, mRNA, complete cds.
ACCESSION   BC010993
VERSION     BC010993.1  GI:15012172
KEYWORDS    MGC.
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 1491)
  AUTHORS   Strausberg,R.
  TITLE     Direct Submission
  JOURNAL   Submitted (23-JUL-2001) National Institutes of Health, Mammalian
            Gene Collection (MGC), Cancer Genomics Office, National Cancer
            Institute, 31 Center Drive, Room 11A03, Bethesda, MD 20892-2590,
            USA
  REMARK    NIH-MGC Project URL: http://mgc.nci.nih.gov
COMMENT     Contact: MGC help desk
            Email: cgapbs-r@mail.nih.gov
            Tissue Procurement: ATCC
            cDNA Library Preparation: Rubin Laboratory
            cDNA Library Arrayed by: The I.M.A.G.E. Consortium (LLNL)
            DNA Sequencing by: Institute for Systems Biology
            http://www.systemsbiology.org
            contact: amadan@systemsbiology.org
            Anup Madan, Rachel Dickhoff, Jessica Fahey, Stephanie Ford, Julia
            Greene, Mark Ketteman and Anuradha Madan
            Clone distribution: MGC clone distribution information can be found
            through the I.M.A.G.E. Consortium/LLNL at: http://image.llnl.gov
            Series: IRAL Plate: 25 Row: k Column: 12
            This clone was selected for full length sequencing because it
            passed the following selection criteria: matched mRNA gi:10434146.
FEATURES             Location/Qualifiers
     source          1..1491
                     /organism="Homo sapiens"
                     /mol_type="mRNA"
                     /db_xref="LocusID:81617"
                     /db_xref="taxon:9606"
                     /clone="MGC:15031 IMAGE:3956127"
                     /tissue_type="Placenta, choriocarcinoma"
                     /clone_lib="NIH_MGC_21"
                     /lab_host="DH10B-R"
                     /note="Vector: pOTB7"
     CDS             416..1285
                     /codon_start=1
                     /product="hypothetical protein FLJ12577"
                     /protein_id="AAH10993.1"
                     /db_xref="GI:15012173"
                     /translation="MKEILCGTNEKEPPTEAVAQLAQELYSSGLLVTLIADLQLIDFE
                     EKKDVTQIFNNILRRQIGTRSPTVEYISAHPHILFMLLKGYEAPQIALRCGIMLRECI
                     RHEPLAKIILFSNQFRDFFKYVELSTFDIASDAFATFKDLLTRHKVLVADFLEQNYDT
                     IFEDYEKLLQSENYVTKRQSLKLLGELILDRHNFAIMTKYISKPENLKLMMNLLRDKS
                     PNIQFEAFHVFKVFVASPHKTQPIVEILLKNQPKLIEFLSSFQKERTDDEQFADEKNY
                     LIKQIRDLKKTAP"
BASE COUNT      503 a    290 c    305 g    393 t
ORIGIN



Query Match 99.5%; Score 1009.2; DB 9; Length 1491;
Best Local Similarity 99.7%; Pred. No. 1e-236;
Matches 1011; Conservative 0; Mismatches 3; Indels 0; Gaps 0;



Qy      1 ATGAAAAAAATGCCTTTGTTTAGTAAATCACACAAAAATCCAGCAGAAATTGTGAAAATC 60
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    272 ATGAAAAAAATGCCTTTGTTTAGTAAATCACACAAAAATCCAGCAGAAATTGTGAAAATC 331

Qy     61 CTGAAAGACAATTTGGCCATTTTGGAAAAGCAAGACAAAAAGACAGACAAGGCTTCAGAA 120
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    332 CTGAAAGACAATTTGGCCATTTTGGAAAAGCAAGACAAAAAGACAGACAAGGCTTCAGAA 391

Qy    121 GAAGTGTCTAAATCACTGCAAGCAATGAAAGAAATTCTGTGTGGTACAAACGAGAAAGAA 180
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    392 GAAGTGTCTAAATCACTGCAAGCAATGAAAGAAATTCTGTGTGGTACAAACGAGAAAGAA 451

Qy    181 CCCCCAACAGAAGCAGTGGCTCAGCTAGCACAAGAACTCTACAGCAGTGGCCTGCTAGTG 240
          ||||| |||||||||||||||||||||||||||||||||||||||||||||||||| |||
Db    452 CCCCCGACAGAAGCAGTGGCTCAGCTAGCACAAGAACTCTACAGCAGTGGCCTGCTGGTG 511

Qy    241 ACACTGATAGCTGACCTGCAGCTGATAGACTTTGAGGGAAAAAAAGATGTGACCCAGATA 300
          ||||||||||||||||||||||||||||||||||||| ||||||||||||||||||||||
Db    512 ACACTGATAGCTGACCTGCAGCTGATAGACTTTGAGGAAAAAAAAGATGTGACCCAGATA 571

Qy    301 TTTAACAACATCTTGAGAAGACAGATAGGCACTCGGAGTCCTACTGTGGAGTATATTAGT 360



Db 

Qy 
Db 

Qy 

Db 

Qy 
Db 

Qy 

Db 

Qy 
Db 

Qy 
Db 

Qy 
Db 

Qy 
Db 

Qy 
Db 

Qy 

Db 

Qy 

Db 



1 1 Ml 1 1 1 II 1 1 1 1 1 II 1 1 1 II II 1 1 1 1 Ml 1 1 111 1 1 Ml! 1 1 1 II II M 1 1 II II 1 1! 

572 TTTAACAACATCTTGAGAAGACAGATAGGCACTCGGAGTCCTACTGTGGAGTATATTAGT 631 
361 GCTCATCCTCATATCCTGTTTATGCTCCTCAAAGGATATGAAGCCCCACAGATTGCCTTA 42 0 

I M 1 1 1 1 1 M 1 1 M Ml 1 1 M I M I MM I M 1 1 M I M I M 1 1 1 M 1 1 M I II 1 1 M I 

632 GCTC^TCCTC^TATCCTGTTTATGCTCCTO^GGATATGAAGCCCaVCAGATTGCCTTA 691 
421 CGTTGTGGGATTATGCTGAGAGAATGTATTCGACATGAACCACTTGCCAAAATCATCCTC 480 

M II I Ml 1 1 1 1 1 1 1 II M 1 1 1 III 1 1 Mill I II 1 1 1 1 II I M I M II I II Mill Ml 

692 CGTTGTGGGATTATGCTGAGAGAATGTATTCGACATGAACCACTTGCCAAAATCATCCTC 751 
4 81 TTTTCTAATCAATTCAGAGATTTCTTTAAGTACGTGGAGTTGTCAACATTTGATATTGCT 54 0 

II II 1 1 1 MM 1 1 II II 1 1 1 M II I II III I II III 1 1 1 M I II I MM I II IM 1 1 Ml 

7 52 TTTTCTAATCAATTCAGAGATTTCTTTAAGTACGTGGAGTTGTCAACATTTGATATTGCT 811 
541 TCAGATGCCTTTGCTACTTTCAAGGATTTACTAACCAGACATAAAGTGTTGGTAGCAGAC 60 0 

I M M I II M M 1 1 MM I M I M I M I! II 1 1 M I M M I II II I II I M Ml II II 

812 TCAGATGCCTTTGCTACTTTCAAGGATTTACTAACCAGACATAAAGTGTTGGTAGCAGAC 871 



601 



872 



661 



932 



721 



992 



781 



TTCTTAGAACAAAATTACGACACTATTTTTGAAGACTATGAGAAATTGCTTCAGTCTGAG 660 

MM II II M I M M II M II I M M II I M II M 1 1 M II I M M II I M MM II II 

TTCTTAGAACAAAATTACGACACTATTTTTGAAGACTATGAGAAATTGCTTCAGTCTGAG 931 
AATTATGTTACTAAGAGACAGTCTTTAAAGCTG CTAGGGGAG CTGAT CCTGGACCGT CAC 72 0 

MM III 1 1 1 1 1 M MM I M III 1 1 III 1 1 M 1 1 M 1 1 II I II I MM III I II 1 1 III 

AATTATGTTACTAAGAGACAGTCTTTAAAGCTG CTAGGGGAG CTGAT CCTGGACCGTCAC 991 
AACTTTGCCATCATGACAAAGTATATCAGC^^ 78 0 

I ! 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 i 1 1 1 1 1 1 1 1 1 1 1 1 M 1 1 1 1 1 1 ! I IN I 

AACTTTGCCATC^TGACAAAGTATATC^GCAAGCCGGAGAACCTGAAACTa^TGATGAAC 1051 



840 



CTCCTTCGGGATAAAAGTCCCAACATCCAGTTTGAAGCCTTTCATGTTTTTAAGGTGTTT 

1 1 1 1 1 1 1 M 1 1 1 1 1 1 1 1 1 1 1 M II I II I II 1 1 1 1 II 1 1 1 M I II 1 1 1 1 1 1 M I M M II I 

1052 CTCCTTCGGGATAAAAGTCCCAACATCCAGTTTGAAGCCTTTCATGTTTTTAAGGTGTTT 1111 



841 



900 



GTGGCCAGTCCTCACAAAACACAGCCTATTGTGGAGATCCTG 

II II 1 1 1 MM 1 1 II II 1 1 1 MM 1 1 1 1 II I II III 1 1 1 III M I II II I II III Mill 

1112 GTGGCCAGTCCT(^C^UWVC^(^GCCTATTGTGGAGATCCTGTTAAAAAATCAGCCCAAA 1171 



90] 



960 



CTCATTGAGTTTCTGAG CAG CTTC CAAAAAGAAAGGACGGATGATGAGCAGTTCG CTGAC 

MM MMMMMMMMMMMMMMMMMMMMMMMMMMMM 

1172 CT CATTGAGTTTCTGAG CAG CTTC CAAAAAGAAAGGACGGATGATGAG CAGTTCG CTGAC 1231 
961 GAGAAGAACTACTTGATTAAACAGAT CCGAGACTTGAAGAAAACGG C CCCTTGA 1014 

MMMMMMMMMMMIM MMMMMMMMMMMMMMI 

12 32 GAGAAGAACTACTTGATTAAACAGATCCGAGACTTGAAGAAAACGGCCCCTTGA 12 85 



RESULT 5 
BD157871 
LOCUS       BD157871     2002 bp    DNA     linear   PAT 17-JAN-2003
DEFINITION  Primer for synthesizing full-length cDNA and use thereof.
ACCESSION   BD157871
VERSION     BD157871.1  GI:27863629
KEYWORDS    JP 2002191363-A/12714.
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 2002)
  AUTHORS   Ota,T., Isogai,T., Nishikawa,T., Hayashi,K., Saito,K., Yamamoto,J.,
            Ishii,S., Sugiyama,T., Wakamatsu,A., Nagai,K. and Otsuki,T.
  TITLE     Primer for synthesizing full-length cDNA and use thereof
  JOURNAL   Patent: JP 2002191363-A 12714 09-JUL-2002;
            HELIX RESEARCH INSTITUTE
COMMENT     OS   Homo sapiens (human)
            PN   JP 2002191363-A/12714
            PD   09-JUL-2002
            PF   28-JUL-2000 JP 2000280990
            PI   TOSHIO OTA, TAKAO ISOGAI, TETSUO NISHIKAWA, KOJI HAYASHI, KAORU
            PI   SAITO,
            PI   JUNICHI YAMAMOTO, SHIZUKO ISHII, TOMOYASU SUGIYAMA, AI WAKAMATSU,
            PI   KEIICHI NAGAI, TETSUJI OTSUKI
            PC   C12N15/09,C07K14/47,C07K16/18,C12N1/15,C12N1/19,C12N1/21,C12N5/
            PC   10,
            PC   C12P21/02,C12Q1/68//C12P21/08,G06F17/30,C12N15/00,C12N5/00
            CC   Primer for synthesizing full-length cDNA and use thereof
            FH   Key             Location/Qualifiers
            FT   CDS             (127)..(993).
FEATURES             Location/Qualifiers
     source          1..2002
                     /organism="Homo sapiens"
                     /mol_type="genomic DNA"
                     /db_xref="taxon:9606"
BASE COUNT      594 a    418 c    463 g    527 t
ORIGIN



Query Match 97.9%; Score 992.8; DB 6; Length 2002;
Best Local Similarity 99.8%; Pred. No. 1.1e-232;
Matches 994; Conservative 0; Mismatches 2; Indels 0; Gaps 0;



Qy 
Db 



1 9 TTTAGTAAATCACACAAAAATCCAGCAGAAATTGTGAA^ 

lllllllllllllllllllllllll llllll MINI lllllllllll llllllllllll 

1 TTTAGTAAATCACACAAAAATCCAGCAGAAATTGTGAAAATCCTGAAAGA 



78 



60 



Qy 

Db 

Qy 

Db 

Qy 

Db 

Qy 
Db 

Qy 



79 ATTTTGGAAAAGCAAGACAAAAAGACAGACAAGGCT 138 

1 1 1 1 Ml! 1 1 1 1 1 1 1 1 1 1 1 1 III I! ! II 1 1 1 1 1 1 II I II I II I II 1 1 1 II I II Ml II 1 1 

61 ATTTTGGAAAAGCAAGACAAAAAGACAGACAAGGCTTCAGAAGAAGTGTCTA 12 0 

139 CAACJCAATGAAAGAAATTCTGTGTGGTACA^ 198 

MMMMMMMMMIMMMMMMMMMMMMMMMMMMMMI 

121 (^VAGCAATGAAAGAAATTCTGTGTGGTACAAACGAGAAAGAACCC^ 180 



199 



258 



GCTCAGCTAG CACAAGAACT CTACAGCAGTGG C CTGCTAGTGACACTGATAGCTGACCTG 

MMMMMMMMMMMMMMMMMMM MMMMMMIMMMM 

181 GCTCA.GCTAG(^O^GAACTCTACAGCAGTGGCCTGCTGGTGACACTGATAGCTGACCTG 24 0 
2 5 9 CAG CTGATAGACTTTGAGGGAAAAAAAGATGTGACC CAGATATTTAACAACATCTTGAGA 318 

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 ! 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 

241 CAGCTGATAGACTTTGAGGGAAAAAAAGATGTGACCC^GATATTTAACAACATCTTGAGA 3 00 



319 AGACAGATAGGCACTCGGAGTCCTACTGTGGAGTATATTAGTGCTCATCCTCATATCCTG 

MMMMMMMMMMMMMMMMMMMMMMMMMMMMMM 



378 



Db 


3 01 


ALxA.LA.L7A IALAj LA L. 1 L-LiLtALj 1 L-L 1AL ILj ILjLiALj 1 A 1 A 1 lALrlvjU 1 L-tt.1 J- V-~rt.-Li-v j. v^v^ ivj 




Qy 


379 


TTTATGCTCCTCAAAGGATATGAAGCCCCACAGATTGCCTTACGTTGTGGGATTATGCTG 


438 




1 1 Ml 1 II 1 II 1 1 1 1 I 1 1 1 II I 1 1 1 II III 1 1 1 II 1 1 1 II 1 II 1 1 1 1 1 1 1 M Ml 1 1 M 




Db 


3 61 


mmm A rpnprpppmpTv tv 7\pn7\TaTPa A P ppppn P A P A TTP PPTTA PPTTPTPPP A TTATrTPTP 
1 1 1 Al LjL 1 LL 1 LAfiiibbii 1 A 1LxAALiLL.LLAL-M.LtH. 1 1LjL.L> 1 lALu 1 lo luuun a irtiuv- J. kj 


420 


Qy 


439 


AGAGAATGTATTCGACATGAACCACTTGCCAAAATCATCCTCTTTTCTAATCAATTCAGA 


498 




llllllllllllllllllllllllllll IIIIIIIIMIIIIIIIIIIIIIMIIIIII 




Db 


421 


AP7\PJ\ A TP*TA TTPP A PA TP A A PPA PTTPTPB A A A TP A TPPTPTTTTPT A A TP A A TTP AP A 

ALxALxAA 1 Lj 1 A 1 1 L-LtAlA 1 LiAALLAL 1 1 Lj 1 LAAAA 1 LA 1 L-L. 1LII 1 it xrtJ\ i \^t\±\ x I 


4ft 0 


Qy 


499 


GATTTCTTTAAGTACGTGGAGTTGTCAACATTTGATATTGCTTCAGATGCCTTTGCTACT 


558 




II i II Ml 1 1 1 1 1 1 1 1 1 1 1 Ml 1 1 1 II 1 1 1 1 1 MM II II 1 1 II 1 1 1 1 1 1 II III 1 1 M 




Db 


4 o 1 


r»A tttpttta APT A PPTPP A PTTPTP A A PA TTTP AT ATTPPTTPAPATPPPTTTPiPTA PT 
UA 1 1 1L1 1 1 AALj 1 ALLj ILtLxALj 1 ILj 1 LJ-iriLi-i. 1 1 iun lr\ X X uL 1 1 v_j-vvjj-i. x vjv-k_ -L ± ±\j^ ±t\.\^ x 


54 0 


Qy 


559 


TTCAAGGATTTACTAACCAGACATaAAGTGTTGGTAGCAGACTTCTTAGAACaaaATT^ 


618 




M 1 Mill 1 II 1 II II 1 II MM 1 III II 1 II II Ml 1 II 1 III III II 1 II III 1 III 




Db 


tr a i 
b4 1 


t'tp a A PPA TTTA fTA APPAPAPATAA APTPTTPPT APP AP A PTTPTTAP A A P A A A ATT A P 


U W \J 


Qy 


619 


GACACTATTTTTGAAGACTATGAGAAATTGCTTCAGTCTGAGAATTATGTTACTAAGAGA 


678 




II III il II II 1 II II 1 II MM 1! Ill II 1 MM III II 1 1 II 1 III II 1 Mill MM 




Db 


6 01 


ni\ni\ /-lrp7\ i-prpryirprp/1 A A P 1 A PTA TP A P A A A TTP PTT P A P T PTP A P A ATTATPTTA PTA A PAPA 

LxAvJAL. 1 A 1 1 1 1 1 LxAA-LrAL. 1 A 1 bAbAAA 1 1 LjL^ 1 1 LALj 1L1 bAbAA 1 1 A 1 Li 1 I.H.L. 1 i-Ltt.Lji-iLxtt. 


D D \J 


Qy 


679 


CAGTCTTTAAAGCTGCTAGGGGAGCTGATCCT^^ 


738 




II II! M 1 1 II 1 II M 1 M Ml 1 1 III II 1 II II II 1 1 II 1 II 1 II II 1 II 1 II 1 1 M 




Db 


661 


t~\-r\ f i rp^-ir-pi 1 » 1 '7\ A A P HTP OTA P^P^P'P 1 A P 1 PTP A TPPTPP A PPPTP A PA A PTTTPPP A TP A TP A PA 

LACjTLTTj-AAAbjL lULlAUOLioAoL luAl 1 IbLLAlLAlbALA 


79 0 


Qy 


739 


AAGTATATCAGCAAGCCGGAGaACCTGAAACTCATGATGAACCTCCTTCGGGATAaAAGT 


798 




!l 1 1 1 1 1 1 1 1 1 1 1 1 i : 1 1 1 II i 1 1 1 1 1 1 1 M 1 1 ! 1 II M 1 1 : 1 1 1 1 ! 1 1 1 1 1 1 1 1 1 1 1 1 




Db 


/zl 


A AP i TATATP i AP 1 P 1 A A P i PP i P 1 P A P 1 A A PPTP AAA PTP A TP ATP A A PPTPPTTPPPP A T A A A APT 
AACj 1 A 1 A 1 LAULAALrL-LLioALiAACl- 1 LjAAAL, 1 LA 1 LxA 1 LrAAL,L. 1LL11 L-LfLiLtA i /^H-BJ-ibr 1 


7 A 0 


Qy 


799 


CCOUVC^TCC^GTTTGAAGCCTTTCA.TGTTTTTAAGGTGTTTGTGGCl^GTCCTCA 


858 




MM II 1 M 1 M IM II M M II 1 1 1 Ml III III II 1 II III II 1 Mill II MM II 




Db 


•"7 O T 

781 


PPP7\ 7\ ri7\ rpz-'P 1 A PTTTPA A P 1 P t PTTTPA TPTTTTTA A PPTPTTTPTPPPP APTPPTP A PA A A 

LLLAAlAI L-LALjI 1 ILxAALjLL-I 1 ILAlLrl 1 1 i IAALjLjILjI 1 ILj lL7L7L.LA.Ljl LL 1 LALAAR 


0 *± \j 


Qy 


859 


ACACAGCCTATTGTGGAGATCCTGTTAaaAAATl^GCCCAAACTCATTGAGTTTCTGAGC 


918 




II 1 II 1 II 1 1 MMMI M II II 1 1 1 Ml Ml III II 1 II III II 1 1! III II MM II 




Db 


841 


AL^L^GCCTATTGTGGAGATCCTGTTAAAAAATCAGCCaU^CTCATTGAGTTTCTGA 


900 


QY 


Q 1 Q 

y i y 


A PPTTPPA A A A AP A A APP A PPP A TP A TPAPPAPTTPPPTP A PPAP A AP A A PTA PTTP ATT 


97ft 




M 1 II III 1 1 Ml MM 1 II II 1 1 MM II 1 II! 1 1 1 II 1 II II 1 II III II 1 II II 1 




Db 


901 


AGCTTCOVaaaAGaAAGGACGGATGATGAGCAGTTCGCTGACGAGaAGaACTACTTGATT 


960 


Qy 


979 


AAACAGATCCGAGACTTGAAGaAAACGGCCCCTTGA 1014 




Db 


961 


IMIIMIIIIIIIIIIIMIIIIIIIMIMIII 

AAACAGATCCGAGACTTGaAGAAAACGGCCCCTTGA 996 





RESULT 6 
AK022639 
LOCUS       AK022639     2002 bp    mRNA    linear   PRI 01-AUG-2002
DEFINITION  Homo sapiens cDNA FLJ12577 fis, clone NT2RM4001047, highly similar
            to MO25 PROTEIN.
ACCESSION   AK022639
VERSION     AK022639.1  GI:10434146
KEYWORDS    oligo capping; fis (full insert sequence)
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1
  AUTHORS   Isogai,T., Ota,T., Hayashi,K., Sugiyama,T., Otsuki,T., Suzuki,Y.,
            Nishikawa,T., Nagai,K., Sugano,S., Shiratori,A., Sudo,H.,
            Wagatsuma,M., Hosoiri,T., Kaku,Y., Kodaira,H., Kondo,H.,
            Sugawara,M., Takahashi,M., Chiba,Y., Ishida,S., Murakawa,K.,
            Ono,Y., Takiguchi,S., Watanabe,S., Kimura,K., Murakami,K.,
            Ishii,S., Kawai,Y., Saito,K., Yamamoto,J., Wakamatsu,A.,
            Nakamura,Y., Nagahari,K., Masuho,Y., Ninomiya,K. and Iwayanagi,T.
  TITLE     NEDO human cDNA sequencing project
  JOURNAL   Unpublished
REFERENCE   2  (bases 1 to 2002)
  AUTHORS   Isogai,T. and Otsuki,T.
  TITLE     Direct Submission
  JOURNAL   Submitted (23-AUG-2000) Takao Isogai, Helix Research Institute,
            Genomics Laboratory; 1532-3 Yana, Kisarazu, Chiba 292-0812, Japan
            (E-mail:genomics@hri.co.jp, Tel:81-438-52-3975, Fax:81-438-52-3986)
COMMENT     NEDO human cDNA sequencing project supported by Ministry of
            International Trade and Industry of Japan; cDNA full insert
            sequencing: Research Association for Biotechnology; cDNA library
            construction, 5'- & 3'-end one pass sequencing and clone selection:
            Helix Research Institute (supported by Japan Key Technology Center
            etc.) and Department of Virology, Institute of Medical Science,
            University of Tokyo.
FEATURES             Location/Qualifiers
     source          1..2002
                     /organism="Homo sapiens"
                     /mol_type="mRNA"
                     /db_xref="taxon:9606"
                     /clone="NT2RM4001047"
                     /cell_line="NT2"
                     /cell_type="teratocarcinoma"
                     /clone_lib="NT2RM4"
                     /note="cloning vector: pME18SFL3~mRNA from uninduced NT2
                     neuronal precursor cells."
     CDS             127..996
                     /note="unnamed protein product"
                     /codon_start=1
                     /protein_id="BAB14147.1"
                     /db_xref="GI:10434147"
                     /translation="MKEILCGTNEKEPPTEAVAQLAQELYSSGLLVTLIADLQLIDFE
                     GKKDVTQIFNNILRRQIGTRSPTVEYISAHPHILFMLLKGYEAPQIALRCGIMLRECI
                     RHEPLVKIILFSNQFRDFFKYVELSTFDIASDAFATFKDLLTRHKVLVADFLEQNYDT
                     IFEDYEKLLQSENYVTKRQSLKLLGELILDRHNFAIMTKYISKPENLKLMMNLLRDKS
                     PNIQFEAFHVFKVFVASPHKTQPIVEILLKNQPKLIEFLSSFQKERTDDEQFADEKNY
                     LIKQIRDLKKTAP"
BASE COUNT      594 a    418 c    463 g    527 t
ORIGIN



Query Match 97.9%; Score 992.8; DB 9; Length 2002;
Best Local Similarity 99.8%; Pred. No. 1.1e-232;
Matches 994; Conservative 0; Mismatches 2; Indels 0; Gaps 0;



Qy 19 TTTAGTAAATCACACAAAAATCCAGCAGAAATTGTGAA 78 

M 1 1 1 1 1 1 ! 1 1 1 1 1 1 1 lh 1 1 ' I i 1 1 ! II 1 1 II I IN I II 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 

Db 1 TTTAGTAAATCACACAAAAATCCAGCAGAAATTGTGAAAATCCTGAAAGACAATTTGGCC 6 0 



Qy 


79 


Db 


61 


Qy 


139 


Db 


121 


Qy 


199 


Db 


181 


Qy 


259 


Db 


241 


Qy 


319 


Db 


301 


Qy 


379 


Db 


361 


Qy 


439 


Db 


421 


Qy 


499 


Db 


481 


Qy 


559 


Db 


541 


Qy 


619 


Db 


601 


Qy 


679 


Db 


661 


Qy 


739 


Db 


721 


Qy 


799 


Db 


781 


Qy 


859 


Db 


841 



ATTTTGGAAAAGCAAGACAAAAAGA 138 

1 1 1 1 1 M M 1 1 1 M 1 1 1! I II 1 1 1 1 M 1 1 1 1 1 1 Ml Ml I M 1 1 1 1 M I ! 1 1 1 1 1 1 1 1 1 

ATTTTGGAAAAGCAAGACAAAAAGACAGACAAGGCTTCAG^ 12 0 

CAAGCAATGAAAGAAkTTCTGTGTGGTAC^ 198 

M 1 1 M 1 1 M 1 1 M I Ml I II 1 1 1 1 1 1 M 1 1 1 Ml I M M 1 1 M M 1 1 1 1 1 M M 1 1 M 

CAAG CAATGAAAGAAATTCTGTGTGGTACAAACGAGAAAGAACC CC CA^ CAGTG 180 

GCTCAGCTAGCACAAGAACTCTACAGCAGTGGCCTGCTAGTGACACTGATAGCTGACCTG 258 

II 1 1 Ml I M 1 1 1 1 MM I M I II 1 1 1 III 1 1 1 II Ml 1 1 1 1 1 M 1 1 1 1 1 M I ! M M I 

GCTCAGCTAGCACAAGAACTCTACAGCAGTGGCCTGCTGGTGACACTGATAGCTGACCTG 24 0 



1 1 II II II II I II 1 1 1 II I II M 1 1 II III II I III II II III M I III II I Ml I MM 

CAGCTGATAGACTTTGAGGGAAAAAAAGATGTGACCCAGATATTTAACAACATCTTGAGA 
AGACAGATAGGCACTCGGAGTCCTACTGTGGAGTATATTAGTGCTCATCCTCATATCCTG 

I MM I M 1 1 M M I II 1 1 1 M I M 1 1 M Ml Ml I M 1 1 II 1 1 M Ml 1 1 M I II I II 



TTTATGCTCCTCAAAGGATATGAAGCCCCACAGATTGCCTTACGTTGTGGGATTATGCTG 438 

I !! I III 1 1 1 II 1 1 1 II I IN Ml II II! M Ml Ml! MM 1 1 1 1 1 1 II 1 1 1 II 1 1 1 

TTTATGCTCCTCAAAGGATATGAAGCCCCACAGATTGCCTTACGTTGTGGGATTATGCTG 42 0 



Illllillllllllllllllllllllll 1 1 1 III Ml MM I II I II 1 1 1 1 1 III II 



GATTTCTTTAAGTACGTGGAGTTGTCAACATTTGATATTGCTTCAGATGCCTTTGCTACT 558 

M I I II MM I M 1 1 II M M M II M M II 1 1 M 1 1 1 1 II M M Ml I M II II I M I 

GATTTCTTTAAGTACGTGGAGTTGTCAACATTTGATATTGCTTCAGATGCCTTTGCTACT 54 0 

TTCAAGGATTTACTAACCAGACATAAAGTGTTGGTAGCAGACTTCTTAGAACAAAATTAC 618 

M M I : M I M I II I II 1 1 II I II I II II 1 1 1 M M M I M 1 1 II I II M I M II I II I 

TTCAAGGATTTACTAACCAGAC^TAAAGTGTTGGTAGCAGACTTCTTAGAACAAAATTAC 600 

GACACTATTTTTGAAGACTATGAGAAATTGCTTCAGTCTGAGAATTATGTTACTAAGAGA 678 

1 1 II 1 1 1 M 1 1 1 1 M I M 1 1 M 1 1 1 1 Ml 1 1 1 1 1 1 1 1 1 1 1 1 M M I II 1 1 1 1 II II M I 

GACACTATTTTTGAAGACTATGAGAAATTGCTTCAGTCTGAGAATTATGTTACTAAGAGA 660 

CAGTCTTTAAAGCTGCTAGGGGAGCTGATCCTGGACCGTCACAACTTTGCCATCATGACA 738 

M 1 1 M 1 1 1 1 Mi I M I M 1 1 M 1 1 1 M M 1 1 1 M 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 M 1 1 1 1 M I 

CAGTCTTTAAAGCTGCTAGGGGAGCTGATCCTGGACCGTCACAACTTTGCCATGA.TGACA 72 0 

AAGTATATCAGCAAGCCGGAGAACCTGAAACTCATGATGAACCTCCTTCGGGATAAAAGT 7 98 

1 1 1 1 1 1 1 1 1 [ 1 1 1 1 1 1 1 1 1 1 1 1 1 1 i 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 E i 1 1 1 1 1 1 1 1 1 1 i 1 1 

AAGTATATCAG CAAG CCGGAGAACCTGAAACT CATGATGAAC CTCCTTCGGGATAAAAGT 78 0 

CCCAACATCCAGTTTGAAGCCTTTCATGTTTTTAAGGTGTTTGTGGCCAGTCCTCACAAA 858 

M 1 11 1 M I M II Ml Ml MIM III I il III Mill I II II I II I II I Ml MM II 

CCCAACATCCAGTTTGAAGCCTTTCATGTTTTTAAGGTGTTTGTGGCCAGTCCTCACAAA 84 0 

ACACAGCCTATTGTGGAGATCCTGTTAAAAAATCAGCCCAAACTCATTGAGTTTCTGAGC 918 

1 1 II 1 1! I II 1 1 ! II M I Ml M I Ml II 1 1! Ml IM MM I II II 1 1 1 II I 111 II 

ACACAGCCTATTGTGGAGATCCTGTTAAAAAATCAGCCCAAACTCATTGAGTTTCTGAGC 900 



Qy        919 AGCTTCCAAAAAGAAAGGACGGATGATGAGCAGTTCGCTGACGAGAAGAACTACTTGATT 978
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        901 AGCTTCCAAAAAGAAAGGACGGATGATGAGCAGTTCGCTGACGAGAAGAACTACTTGATT 960

Qy        979 AAACAGATCCGAGACTTGAAGAAAACGGCCCCTTGA 1014
              ||||||||||||||||||||||||||||||||||||
Db        961 AAACAGATCCGAGACTTGAAGAAAACGGCCCCTTGA 996



RESULT 7 
BC016128 
LOCUS       BC016128     1359 bp    mRNA    linear   ROD 16-APR-2003
DEFINITION  Mus musculus RIKEN cDNA 1500031K13 gene, mRNA (cDNA clone MGC:28889
            IMAGE:4911640), complete cds.
ACCESSION   BC016128
VERSION     BC016128.1  GI:16359341
KEYWORDS    MGC.
SOURCE      Mus musculus (house mouse)
  ORGANISM  Mus musculus
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
REFERENCE   1  (bases 1 to 1359)
  AUTHORS   Strausberg,R.L., Feingold,E.A., Grouse,L.H., Derge,J.G.,
            Klausner,R.D., Collins,F.S., Wagner,L., Shenmen,C.M., Schuler,G.D.,
            Altschul,S.F., Zeeberg,B., Buetow,K.H., Schaefer,C.F., Bhat,N.K.,
            Hopkins,R.F., Jordan,H., Moore,T., Max,S.I., Wang,J., Hsieh,F.,
            Diatchenko,L., Marusina,K., Farmer,A.A., Rubin,G.M., Hong,L.,
            Stapleton,M., Soares,M.B., Bonaldo,M.F., Casavant,T.L.,
            Scheetz,T.E., Brownstein,M.J., Usdin,T.B., Toshiyuki,S.,
            Carninci,P., Prange,C., Raha,S.S., Loquellano,N.A., Peters,G.J.,
            Abramson,R.D., Mullahy,S.J., Bosak,S.A., McEwan,P.J.,
            McKernan,K.J., Malek,J.A., Gunaratne,P.H., Richards,S.,
            Worley,K.C., Hale,S., Garcia,A.M., Gay,L.J., Hulyk,S.W.,
            Villalon,D.K., Muzny,D.M., Sodergren,E.J., Lu,X., Gibbs,R.A.,
            Fahey,J., Helton,E., Ketteman,M., Madan,A., Rodrigues,S.,
            Sanchez,A., Whiting,M., Madan,A., Young,A.C., Shevchenko,Y.,
            Bouffard,G.G., Blakesley,R.W., Touchman,J.W., Green,E.D.,
            Dickson,M.C., Rodriguez,A.C., Grimwood,J., Schmutz,J., Myers,R.M.,
            Butterfield,Y.S., Krzywinski,M.I., Skalska,U., Smailus,D.E.,
            Schnerch,A., Schein,J.E., Jones,S.J. and Marra,M.A.
  TITLE     Generation and initial analysis of more than 15,000 full-length
            human and mouse cDNA sequences
  JOURNAL   Proc. Natl. Acad. Sci. U.S.A. 99 (26), 16899-16903 (2002)
  MEDLINE   22388257
   PUBMED   12477932
REFERENCE   2  (bases 1 to 1359)
  AUTHORS   Strausberg,R.
  TITLE     Direct Submission
  JOURNAL   Submitted (22-OCT-2001) National Institutes of Health, Mammalian
            Gene Collection (MGC), Cancer Genomics Office, National Cancer
            Institute, 31 Center Drive, Room 11A03, Bethesda, MD 20892-2590,
            USA
  REMARK    NIH-MGC Project URL: http://mgc.nci.nih.gov
COMMENT     Contact: MGC help desk
            Email: cgapbs-r@mail.nih.gov
            Tissue Procurement: Jeffrey E. Green, M.D.
            cDNA Library Preparation: Life Technologies, Inc.
            cDNA Library Arrayed by: The I.M.A.G.E. Consortium (LLNL)
            DNA Sequencing by: Genome Sequence Centre,
            BC Cancer Agency, Vancouver, BC, Canada
            info@bcgsc.bc.ca
            Steven Jones, Jennifer Asano, Ian Bosdet, Yaron Butterfield,
            Susanna Chan, Readman Chiu, Chris Fjell, Erin Garland, Ran Guin,
            Letticia Hsiao, Martin Krzywinski, Reta Kutsche, Oliver Lee, Soo
            Sen Lee, Victor Ling, Carrie Mathewson, Candice McLeavy, Steven
            Ness, Pawan Pandoh, Anna-Liisa Prabhu, Parvaneh Saeedi, Jacqueline
            Schein, Duane Smailus, Michael Smith, Lorraine Spence, Jeff Stott,
            Michael Thorne, Miranada Tsai, Natasja van den Bosch, Jill Vardy,
            George Yang, Scott Zuyderduyn, Marco Marra.
            Clone distribution: MGC clone distribution information can be found
            through the I.M.A.G.E. Consortium/LLNL at: http://image.llnl.gov
            Series: IRAK Plate: 38 Row: m Column: 23
            This clone was selected for full length sequencing because it
            passed the following selection criteria: Similarity but not
            identity to protein.
FEATURES             Location/Qualifiers
     source          1..1359
                     /organism="Mus musculus"
                     /mol_type="mRNA"
                     /strain="FVB/N"
                     /db_xref="taxon:10090"
                     /clone="MGC:28889 IMAGE:4911640"
                     /tissue_type="Salivary gland, 10 week old female mouse"
                     /clone_lib="NCI_CGAP_SG2"
                     /lab_host="DH10B"
                     /note="Vector: pCMV-SPORT6"
     gene            1..1359
                     /gene="1500031K13Rik"
                     /note="synonyms: 4930520C08Rik, 2810425O13Rik"
                     /db_xref="LocusID:69008"
                     /db_xref="MGI:1916258"
     CDS             262..1266
                     /codon_start=1
                     /product="1500031K13Rik protein"
                     /protein_id="AAH16128.1"
                     /db_xref="GI:16359342"
                     /db_xref="LocusID:69008"
                     /translation="MPLFSKSHKNPAEIVKILKDNLAILEKQDKKTDKASEEVSKSLQ
                     AMKEILCGTNDKEPPTEAVAQLAQELYSSGLLVTLIADLQLIDFEGKKDVTQIFNNIL
                     RRQIGTRCPTVEYISSHPHILFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFSN
                     QFRDFFKYVELSTFDIASDAFATFKDLLTRHKVLVADFLEQNYDTIFEDYEKLLQSEN
                     YVTKRQSLKLLGELILDRHNFTIMTKYISKPENLKLMMNLLRDKSPNIQFEAFHVFKV
                     FVASPHKTQPIVEILLKNQPKLIEFLSSFQKERTDDEQFADEKNYLIKQIRDLKKAAP"
BASE COUNT      418 a    301 c    294 g    346 t
ORIGIN



Query Match 84.9%; Score 860.4; DB 10; Length 1359;
Best Local Similarity 90.5%; Pred. No. 3.2e-200;
Matches 918; Conservative 0; Mismatches 96; Indels 0; Gaps 0;



Qy 



ATGAAAAAAATGCCTTTGTTTAGTAAATCA(^CAAAAATCCAGCAGAAATTGTGAAAATC 

III I! I II I II 1 1 1 1 1 1 II; II I II ! II I II III 1 1 1 II 1 1 1 M I Ml I II Mill 



60 



Db 



253 ATGAAAAAAATGCCTTTGTTTAGTAAM 312 



Qy 


61 


Db 


313 


Qy 


121 


Db 


373 


Qy 


181 


Db 


433 


Qy 


241 


Db 


493 


Qy 


301 


Db 


553 


Qy 


361 


Db 


613 


Qy 


421 


Db 


673 


Qy 


481 


Db 


733 


Qy 


541 


Db 


793 


Qy 


601 


Db 


853 


Qy 


661 


Db 


913 


Qy 


721 


DO 


973 


Qy 


781 


Db 


1033 


Qy 


841 


Db 


1093 



CTGAAAGACAATTTGGCCATTTTGGAAAAGCAAGACAAAAAGAC^^ 12 0 

Illllllllll I II I ; II i 1 1 1 1 1 1 1 1 1 M 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 ! 1 1 1 1 1 1 M 1 1 1 1 

CTGAAAGACAACCTGGCCATTTTGGAAAAGCAAGACAA 372 



II Illll Illll llllllllllllll llllllllllllll II Illll II II 

GAGGTGTQ\AAATCTCTGCAAGCAATGAAGGAAATTCTGTGTGGAACGAACGA(ZAAGGAG 

CCCCCAACAGAAGCAGTGGCTCAGCTAGC^ 

Illll llllllllllllllllllll II II II Illllllllll II I I I I III 



ACACTGATAG CTGACCTG CAG CTGATAGACTTTGAGGGAAAAAAAGATGTGAC C CAGATA 300 

Illll lllllllllllllllll M 1 1 1 1 1 1 II 1 1 1 1 1 1 II 1 1 1 1 1 1 II 1 1 1 III 1 1 ! I 

ACACT CATAG CTGACCTG CAG CTCATAGACTTTGAGGGAAAAAAAGATGTGAC CCAGATA 552 



lllllll lllllllllllll II II III llllllllll Illll II III 



GCT(^TCCTCATATCCTGTTTATGCTCCTau^GGATATGAAGCCCCACAGATTGCCTTA 42 0 

llllllllll llllllllllllll llllllll llllllllllllllllllllllll 

TCTC^TCCTC^CATCCTGTTTATGCTTCTCAAAGGCTATGAAGCCCaVCAGATTGCCTTA 672 

CGTTGTGGGATTATG CTGAGAGAATGTATTCGACATGAAC CACTTG CCAAAAT CATCCT C 48 0 

II llllllllllllll Illll llllllllllllll llllllllllllllllllll 

CGCTGTGGGATTATGCTAAGAGAGTGTATTCGACATGAGCCACTTGCCAAAATCATCCTA 732 

TTTTCTAATCAATTCAGAGATTTCTTTAAGTACGTGGAGTTGTCAACATTTGATATTGCT 54 0 

Illllllllll llllllllllllll Illll II III II I I II llllllll III 

TTTTCTAATCAGTTCAGAGATTTCTTCAAGTATGTTGAGCTGTCCACCTTTGATATCGCT 792 

TCAGATGCCTTTGCTACTTTCAAGGATTTACTAACCAGACATAAAGTGTTGGTAGCAGAC 600 

Illllllllll llllllll llllllll llllllllllllllll llllllllllll 

TCAGATG CCTTCG CTA CTTTTAAGGATTTGTTAAC CAGACATAAAGTATTGGTAG CAGAC 8 52 

TTCTTAGAACAAAATTACGACACTATTTTTGAAGACTATGAGAAATTGCTTCAGTCTGAG 660 

lllllllllllllllll I II I III ! 1 1 1 1 1 II III II I II 1 1 1 1 1 1 II 1 1 1 1 1 1 

TTCTTAGAACAAAATTATGACACTATTTTTGAAGACTATGAGAAACTG CTG CAATCTGAG 912 
AATTATGTTACTAAGAGACAGTCTTTAAAGCTGCTAGGGGAGCTGATCCTGGACCGTCAC 72 0 

II Mill II llllllll lllllllll lllllll lllllllllllllllll III 

AACTATGTGACAAAGAGACAATCTTTAAAGTTGCTAGGTGAGCTGATCCTGGACCGCCAC 972 



II II I I I I Illll lllllllllllllllll llllllllllllll lllllllll 
AATTT CACCATTATGAC CAAGTATAT CAG CAAG CCAGAGAACCTGAAACTGATGATGAAC 

CTCCTTCGGGATAAAAGTCCCAACATCCAGTTTGAAGCCTTTCATGTTTTTAAGGTGTTT 

II Mill II lllllllllllllllll II llllllll Mill llllllllllll 

CTGCTTCGAGACAAAAGTCCCAACATCCAATTCGAAGCCTTCCATGTCTTTAAGGTGTTT 

GTGGCCAGTCCTCACAAAACA(^GCCTATTGTGGAGATCCTGTTAAAAAATCA.GCCCAAA 
llllllll I I llllllll llllllll llllllll M I I I f I I I I I I I I I II I I I 



Qy        901 CTCATTGAGTTTCTGAGCAGCTTCCAAAAAGAAAGGACGGATGATGAGCAGTTCGCTGAC 960
              ||||||||||||||||||||||| || ||||||||||| || || |||||||| ||||||
Db       1153 CTCATTGAGTTTCTGAGCAGCTTTCAGAAAGAAAGGACAGACGACGAGCAGTTTGCTGAC 1212

Qy        961 GAGAAGAACTACTTGATTAAACAGATCCGAGACTTGAAGAAAACGGCCCCTTGA 1014
              |||||||||||| ||||||||||||| ||||||||||||||| | ||||| |||
Db       1213 GAGAAGAACTACCTGATTAAACAGATTCGAGACTTGAAGAAAGCAGCCCCGTGA 1266



RESULT 8 
BC016546 
LOCUS       BC016546     1530 bp    mRNA    linear   ROD 16-APR-2003
DEFINITION  Mus musculus RIKEN cDNA 1500031K13 gene, mRNA (cDNA clone MGC:27972
            IMAGE:3595339), complete cds.
ACCESSION   BC016546
VERSION     BC016546.1  GI:16741456
KEYWORDS    MGC.
SOURCE      Mus musculus (house mouse)
  ORGANISM  Mus musculus
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
REFERENCE   1  (bases 1 to 1530)
  AUTHORS   Strausberg,R.L., Feingold,E.A., Grouse,L.H., Derge,J.G.,
            Klausner,R.D., Collins,F.S., Wagner,L., Shenmen,C.M., Schuler,G.D.,
            Altschul,S.F., Zeeberg,B., Buetow,K.H., Schaefer,C.F., Bhat,N.K.,
            Hopkins,R.F., Jordan,H., Moore,T., Max,S.I., Wang,J., Hsieh,F.,
            Diatchenko,L., Marusina,K., Farmer,A.A., Rubin,G.M., Hong,L.,
            Stapleton,M., Soares,M.B., Bonaldo,M.F., Casavant,T.L.,
            Scheetz,T.E., Brownstein,M.J., Usdin,T.B., Toshiyuki,S.,
            Carninci,P., Prange,C., Raha,S.S., Loquellano,N.A., Peters,G.J.,
            Abramson,R.D., Mullahy,S.J., Bosak,S.A., McEwan,P.J.,
            McKernan,K.J., Malek,J.A., Gunaratne,P.H., Richards,S.,
            Worley,K.C., Hale,S., Garcia,A.M., Gay,L.J., Hulyk,S.W.,
            Villalon,D.K., Muzny,D.M., Sodergren,E.J., Lu,X., Gibbs,R.A.,
            Fahey,J., Helton,E., Ketteman,M., Madan,A., Rodrigues,S.,
            Sanchez,A., Whiting,M., Madan,A., Young,A.C., Shevchenko,Y.,
            Bouffard,G.G., Blakesley,R.W., Touchman,J.W., Green,E.D.,
            Dickson,M.C., Rodriguez,A.C., Grimwood,J., Schmutz,J., Myers,R.M.,
            Butterfield,Y.S., Krzywinski,M.I., Skalska,U., Smailus,D.E.,
            Schnerch,A., Schein,J.E., Jones,S.J. and Marra,M.A.
  TITLE     Generation and initial analysis of more than 15,000 full-length
            human and mouse cDNA sequences
  JOURNAL   Proc. Natl. Acad. Sci. U.S.A. 99 (26), 16899-16903 (2002)
  MEDLINE   22388257
   PUBMED   12477932
REFERENCE   2  (bases 1 to 1530)
  AUTHORS   Strausberg,R.
  TITLE     Direct Submission
  JOURNAL   Submitted (31-OCT-2001) National Institutes of Health, Mammalian
            Gene Collection (MGC), Cancer Genomics Office, National Cancer
            Institute, 31 Center Drive, Room 11A03, Bethesda, MD 20892-2590,
            USA
  REMARK    NIH-MGC Project URL: http://mgc.nci.nih.gov
COMMENT     Contact: MGC help desk
            Email: cgapbs-r@mail.nih.gov
            Tissue Procurement: Jeffrey Green M.D.
            cDNA Library Preparation: Life Technologies, Inc.
            cDNA Library Arrayed by: The I.M.A.G.E. Consortium (LLNL)
            DNA Sequencing by: Baylor College of Medicine Human Genome
            Sequencing Center
            Center code: BCM-HGSC
            Web site: http://www.hgsc.bcm.tmc.edu/cdna/
            Contact: amg@bcm.tmc.edu
            Gunaratne, P.H., Garcia, A.M., Lu, X., Hulyk, S.W., Loulseged, H.,
            Kowis, C.R., Sneed, A.J., Martin, R.G., Muzny, D.M., Nanavati,
            A.N., Gibbs, R.A.



            Clone distribution: MGC clone distribution information can be found
            through the I.M.A.G.E. Consortium/LLNL at: http://image.llnl.gov
            Series: IRAK Plate: 35 Row: m Column: 15
            This clone was selected for full length sequencing because it
            passed the following selection criteria: Similarity but not
            identity to protein.
FEATURES             Location/Qualifiers
     source          1..1530
                     /organism="Mus musculus"
                     /mol_type="mRNA"
                     /strain="FVB/N"
                     /db_xref="taxon:10090"
                     /clone="MGC:27972 IMAGE:3595339"
                     /tissue_type="Mammary tumor. C3(1)-Tag model. Infiltrating
                     ductal carcinoma. 5 month old virgin mouse."
                     /clone_lib="NCI_CGAP_Mam6"
                     /lab_host="DH10B"
                     /note="Vector: pCMV-SPORT6"
     gene            1..1530
                     /gene="1500031K13Rik"
                     /note="synonyms: 4930520C08Rik, 2810425O13Rik"
                     /db_xref="LocusID:69008"
                     /db_xref="MGI:1916258"
     CDS             279..1283
                     /codon_start=1
                     /product="1500031K13Rik protein"
                     /protein_id="AAH16546.1"
                     /db_xref="GI:16741457"
                     /db_xref="LocusID:69008"
                     /translation="MPLFSKSHKNPAEIVKILKDNLAILEKQDKKTDKASEEVSKSLQ
                     AMKEILCGTNDKEPPTEAVAQLAQELYSSGLLVTLIADLQLIDFEGKKDVTQIFNNIL
                     RRQIGTRCPTVEYISSHPHILFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFSN
                     QFRDFFKYVELSTFDIASDAFATFKDLLTRHKVXVADFLEQNYDTIFEDYEKLLQSEN
                     YVTKRQSLKLRGELILDRHNFTIMTKYISKPENLKLMMNLLRDKSPNIQFEAFHVFKV
                     FVASPHKTQPIVEILLKNQPKLIEFLSSFQKERTDDEQFADEKNYLIKQIRDLKKAAP"
BASE COUNT      498 a    313 c    326 g    393 t
ORIGIN



Query Match 84.7%; Score 858.8; DB 10; Length 1530; 

Best Local Similarity 90.4%; Pred. No. 8e-200; 

Matches 917; Conservative 0; Mismatches 97; Indels 0; Gaps 0; 

Qy 1 ATGAAAAAAATGCCTTTGTTTAGTAAATCACACAAAAAT 60 

I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I M I I I I II I I I I I I I 

Db 270 ATGAAAAAAATGCCTTTGTTTAGTAAATCACACAAAAATCCAGCA 32 9 



Ov 61 CTGAAAGACAATTTGGCCATTTTGG 120 

M 1 1 1 1 1 1 1 II IIMIIIIIMIIIIIIIIIIIIIIIIIIIIIMIMIIIIIIMII 

Db 33 0 CTGAAAGACAACCTGGCCATTTTGGAAAAGCAAGACAAAAAGACAGAC 389 

Qy        121 GAAGTGTCTAAATCACTGCAAGCAATGAAAGAAATTCTGTGTGGTACAAACGAGAAAGAA 180
              || ||||| ||||| |||||||||||||| |||||||||||||| || ||||| || ||
Db        390 GAGGTGTCAAAATCTCTGCAAGCAATGAAGGAAATTCTGTGTGGAACGAACGACAAGGAG 449

Qy        181 CCCCCAACAGAAGCAGTGGCTCAGCTAGCACAAGAACTCTACAGCAGTGGCCTGCTAGTG 240
              ||||| |||||||||||||||||||| || || || ||||||||||| ||  |||| |||
Db        450 CCCCCTACAGAAGCAGTGGCTCAGCTGGCGCAGGAGCTCTACAGCAGCGGGTTGCTGGTG 509

Qy        241 ACACTGATAGCTGACCTGCAGCTGATAGACTTTGAGGGAAAAAAAGATGTGACCCAGATA 300
              ||||| ||||||||||||||||| ||||||||||||||||||||||||||||||||||||
Db        510 ACACTCATAGCTGACCTGCAGCTCATAGACTTTGAGGGAAAAAAAGATGTGACCCAGATA 569

Qy        301 TTTAACAACATCTTGAGAAGACAGATAGGCACTCGGAGTCCTACTGTGGAGTATATTAGT 360
              || ||||||||| ||||||||||||| || || ||| |||||||||| ||||| || |||
Db        570 TTCAACAACATCCTGAGAAGACAGATTGGTACACGGTGTCCTACTGTCGAGTACATCAGT 629

Qy        361 GCTCATCCTCATATCCTGTTTATGCTCCTCAAAGGATATGAAGCCCCACAGATTGCCTTA 420
               |||||||||| |||||||||||||| |||||||| ||||||||||||||||||||||||
Db        630 TCTCATCCTCACATCCTGTTTATGCTTCTCAAAGGCTATGAAGCCCCACAGATTGCCTTA 689

Qy        421 CGTTGTGGGATTATGCTGAGAGAATGTATTCGACATGAACCACTTGCCAAAATCATCCTC 480
              || |||||||||||||| ||||| |||||||||||||| ||||||||||||||||||||
Db        690 CGCTGTGGGATTATGCTAAGAGAGTGTATTCGACATGAGCCACTTGCCAAAATCATCCTA 749

Qy        481 TTTTCTAATCAATTCAGAGATTTCTTTAAGTACGTGGAGTTGTCAACATTTGATATTGCT 540
              ||||||||||| |||||||||||||| ||||| || ||| |||| || |||||||| |||
Db        750 TTTTCTAATCAGTTCAGAGATTTCTTCAAGTATGTTGAGCTGTCCACCTTTGATATCGCT 809

Qy        541 TCAGATGCCTTTGCTACTTTCAAGGATTTACTAACCAGACATAAAGTGTTGGTAGCAGAC 600
              ||||||||||| |||||||| ||||||||  |||||||||||||||| ||||||||||||
Db        810 TCAGATGCCTTCGCTACTTTTAAGGATTTGTTAACCAGACATAAAGTATTGGTAGCAGAC 869

Qy        601 TTCTTAGAACAAAATTACGACACTATTTTTGAAGACTATGAGAAATTGCTTCAGTCTGAG 660
              ||||||||||||||||| ||||||||||||||||||||||||||| |||| || ||||||
Db        870 TTCTTAGAACAAAATTATGACACTATTTTTGAAGACTATGAGAAACTGCTGCAATCTGAG 929

Qy        661 AATTATGTTACTAAGAGACAGTCTTTAAAGCTGCTAGGGGAGCTGATCCTGGACCGTCAC 720
              || ||||| || |||||||| ||||||||| ||| ||| ||||||||||||||||| |||
Db        930 AACTATGTGACAAAGAGACAATCTTTAAAGTTGCGAGGTGAGCTGATCCTGGACCGCCAC 989

QV 721 AACTTTGCCATCATGACAAAGTATATCAGC 78 0 

II II MM Mill II ill I III IN III II llllllllllll IN 

Db 990 AATTTCACCATTATGACCAAGTATAT CAG CAAGC CAGAGAACCTGAAACTGATGATGAAC 104 9 

Qy 781 CTCCTTCGGGATAAAAGTCCCAACATCCAGTTTGAAGCCTTTCATGTTTTTAAGGTGTTT 84 0 

II Mill II IIIIIIIMIIIIIIM II llllllll Mill I M I i I i 1 1 1 1 1 

Db 1050 CTGCTTCGAGACAAAAGTCCCAACATCCAATTCGAAGCCTTCCATGTCTTTAAGGTGTTT 1109

Qy 841 GTGGCCAGTCCTCACAAAACACAGCCTATTGTGGAGATCCTGTTAAAAAATCAGCCCAAA 900

llllllll II llllllll llllllll IIIIMM IN II I II I III II I III N 

Db 1110 GTGGC(^GCCCCCACAAAACGCAGCCTATCGTGGAGATTCTGTTAAAAAATCAGCCa^ 1169 



Qy 901 CTCATTGAGTTTCTGAG CAG CTTC CAAAAAGAAAGGACGGATGATGAG CAGTTCG CTGAC 960 

Illllllllllllllllllllll II lllllllllll II II IIIIIMI llllll 

Db 1170 CTCATTGAGTTTCTGAGCAGCTTTCAGAAAGAAAGGACAGACGACGAGCAGTTTGCTGAC 1229

Qy 961 GAGAAGAACTACTTGATTAAACAGATCCGAGACTTGAAGAAAACGGCCCCTTGA 1014 

Illlllllllli lllllllllllll llllllllillllll I Mill III 
Db 1230 GAGAAGAACTACCTGATTAAACAGATTCGAGACTTGAAGAAAGCAGCCCCGTGA 1283



RESULT 9 
BD147463 
LOCUS BD147463 822 bp DNA linear PAT 17-JAN-2003
DEFINITION Primer for synthesizing full-length cDNA and use thereof.
ACCESSION BD147463
VERSION BD147463.1 GI:27853221
KEYWORDS JP 2002191363-A/2306.
SOURCE Homo sapiens (human)
ORGANISM Homo sapiens
Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE 1 (bases 1 to 822)
AUTHORS Ota,T., Isogai,T., Nishikawa,T., Hayashi,K., Saito,K., Yamamoto,J.,
Ishii,S., Sugiyama,T., Wakamatsu,A., Nagai,K. and Otsuki,T.
TITLE Primer for synthesizing full-length cDNA and use thereof
JOURNAL Patent: JP 2002191363-A 2306 09-JUL-2002;
HELIX RESEARCH INSTITUTE
COMMENT OS Homo sapiens (human)
PN JP 2002191363-A/2306
PD 09-JUL-2002
PF 28-JUL-2000 JP 2000280990
PI TOSHIO OTA, TAKAO ISOGAI, TETSUO NISHIKAWA, KOJI HAYASHI, KAORU
PI SAITO,
PI JUNICHI YAMAMOTO, SHIZUKO ISHII, TOMOYASU SUGIYAMA, AI WAKAMATSU,
PI KEIICHI NAGAI, TETSUJI OTSUKI
PC C12N15/09,C07K14/47,C07K16/18,C12N1/15,C12N1/19,C12N1/21,C12N5/10,
PC C12P21/02,C12Q1/68//C12P21/08,G06F17/30,C12N15/00,C12N5/00
CC Primer for synthesizing full-length cDNA and use thereof
FH Key Location/Qualifiers
FT source 1..822
FT /organism='Homo sapiens (human)'.
FEATURES Location/Qualifiers
source 1..822
/organism="Homo sapiens"
/mol_type="genomic DNA"
/db_xref="taxon:9606"
BASE COUNT 268 a 164 c 171 g 216 t 3 others
ORIGIN

Query Match 76.0%; Score 770.6; DB 6; Length 822;
Best Local Similarity 98.5%; Pred. No. 3.4e-178;
Matches 798; Conservative 0; Mismatches 10; Indels 2; Gaps 2;



Qy         19 TTTAGTAAATCACACAAAAATCCAGCAGAAATTGTGAAAATCCTGAAAGACAATTTGGCC 78
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db          1 TTTAGTAAATCACACAAAAATCCAGCAGAAATTGTGAAAATCCTGAAAGACAATTTGGCC 60

Qy         79 ATTTTGGAAAAGCAAGACAAAAAGACAGACAAGGCTTCAGAAGAAGTGTCTAAATCACTG 138
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         61 ATTTTGGAAAAGCAAGACAAAAAGACAGACAAGGCTTCAGAAGAAGTGTCTAAATCACTG 120

Qy        139 CAAGCAATGAAAGAAATTCTGTGTGGTACAAACGAGAAAGAACCCCCAACAGAAGCAGTG 198
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        121 CAAGCAATGAAAGAAATTCTGTGTGGTACAAACGAGAAAGAACCCCCAACAGAAGCAGTG 180

Qy        199 GCTCAGCTAGCACAAGAACTCTACAGCAGTGGCCTGCTAGTGACACTGATAGCTGACCTG 258
              |||||||||||||||||||||||||||||||||||||| |||||||||||||||||||||
Db        181 GCTCAGCTAGCACAAGAACTCTACAGCAGTGGCCTGCTGGTGACACTGATAGCTGACCTG 240

Qy        259 CAGCTGATAGACTTTGAGGGAAAAAAAGATGTGACCCAGATATTTAACAACATCTTGAGA 318
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        241 CAGCTGATAGACTTTGAGGGAAAAAAAGATGTGACCCAGATATTTAACAACATCTTGAGA 300

Qy        319 AGACAGATAGGCACTCGGAGTCCTACTGTGGAGTATATTAGTGCTCATCCTCATATCCTG 378
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        301 AGACAGATAGGCACTCGGAGTCCTACTGTGGAGTATATTAGTGCTCATCCTCATATCCTG 360

Qy        379 TTTATGCTCCTCAAAGGATATGAAGCCCCACAGATTGCCTTACGTTGTGGGATTATGCTG 438
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        361 TTTATGCTCCTCAAAGGATATGAAGCCCCACAGATTGCCTTACGTTGTGGGATTATGCTG 420

Qy        439 AGAGAATGTATTCGACATGAACCACTTGCCAAAATCATCCTCTTTTCTAATCAATTCAGA 498
              |||||||||||||||||||||||||||| |||||||||||||||||||||||||||||||
Db        421 AGAGAATGTATTCGACATGAACCACTTGTCAAAATCATCCTCTTTTCTAATCAATTCAGA 480

Qy        499 GATTTCTTTAAGTACGTGGAGTTGTCAACATTTGATATTGCTTCAGATGCCTTTGCTACT 558
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        481 GATTTCTTTAAGTACGTGGAGTTGTCAACATTTGATATTGCTTCAGATGCCTTTGCTACT 540

Qy        559 TTCAAGGATTTACTAACCAGACATAAAGTGTTGGTAGCAGACTTCTTAGAACAAAATTAC 618
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        541 TTCAAGGATTTACTAACCAGACATAAAGTGTTGGTAGCAGACTTCTTAGAACAAAATTAC 600

Qy        619 GACACTATTTTTGAAGACTATGAGAAATTGCTTCAGTCTGAGAATTATGTTACTAAGAGA 678
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        601 GACACTATTTTTGAAGACTATGAGAAATTGCTTCAGTCTGAGAATTATGTTACTAAGAGA 660

Qy        679 CAGTCTTTAAAGCTGCTAGGGGAGCTGATCCTGGACCGTCACAACTTTGCCATCATGACA 738
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        661 CAGTCTTTAAAGCTGCTAGGGGAGCTGATCCTGGACCGTCACAACTTTGCCATCATGACA 720

Qy        739 AAGTATATCAGCAAGCCGGAGAACCTGAAACTCATGATGAACCTCCTTCGGGATAAAAGT 798
              ||||||||||||||||||||||||||| |||||||||||||||| ||||||||| |||||
Db        721 AAGTATATCAGCAAGCCGGAGAACCTG-AACTCATGATGAACCTNCTTCGGGAT-AAAGT 778

Qy        799 CCCAACATCCAGTTTGAAGCCTTTCATGTT 828
              |||||||||||||||||  | | |  | ||
Db        779 CCCAACATCCAGTTTGAGCCTTCTGGTTTT 808



RESULT 10 
BD079551 

LOCUS BD079551 831 bp DNA linear PAT 27-AUG-2002
DEFINITION Cancer-associated nucleic acids and polypeptides.
ACCESSION BD079551
VERSION BD079551.1 GI:22625154
KEYWORDS JP 2001516009-A/217.
SOURCE Homo sapiens (human)
ORGANISM Homo sapiens
Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE 1 (bases 1 to 831)
AUTHORS Old,L.J., Scanlan,M.J., Stockert,E., Gure,A., Chen,Y.T., Gout,I.,
O'Hare,M., Obata,Y., Pfreundschuh,M., Tureci,O. and Sahin,U.
TITLE Cancer-associated nucleic acids and polypeptides
JOURNAL Patent: JP 2001516009-A 217 25-SEP-2001;
LUDWIG INSTITUTE FOR CANCER RESEARCH
COMMENT OS Homo sapiens (human)
PN JP 2001516009-A/217
PD 25-SEP-2001
PF 15-JUL-1998 JP 2000503425
PR 17-JUL-1997 US 08/896164, 10-OCT-1997 US 60/061599
PR 10-OCT-1997 US 60/061765, 10-OCT-1997 US 08/948705
PR 11-OCT-1997 GB 9721697.2, 22-JUN-1998 US 09/102322
PI LLOYD J OLD, MATTHEW J SCANLAN, ELISABETH STOCKERT, ALI GURE, YAO TSENG
PI CHEN,
PI IVAN GOUT, MICHAEL O'HARE, YUICHI OBATA, MICHAEL PFREUNDSCHUH,
PI OZLEM TURECI,
PI UGUR SAHIN
PC G01N33/574,A61K38/00,A61K39/395,A61K39/395,A61K45/00,A61K48/00,
PC A61P35/00,
PC C07K14/82,C07K16/32,C12N15/09//C07K16/46,C12P21/08,A61K37/02,
PC C12N15/00
CC Cancer-associated nucleic acids and polypeptides.
FH Key Location/Qualifiers
FT source 1..831
FT /organism='Homo sapiens (human)'.
FEATURES Location/Qualifiers
source 1..831
/organism="Homo sapiens"
/mol_type="genomic DNA"
/db_xref="taxon:9606"
BASE COUNT 285 a 165 c 167 g 209 t 5 others
ORIGIN

Query Match 67.5%; Score 684.6; DB 6; Length 831;
Best Local Similarity 96.1%; Pred. No. 4.1e-157;
Matches 764; Conservative 0; Mismatches 23; Indels 8; Gaps 6;



Qy 
Db 



1 ATGAAAAAAATGCCTTTGTTTAGTAAATCACACAAAAATCCAG 60 

M 1 1 1 1 1 Ml I M I II II 1 1 1 : II i 1 1 1 II 1 1 1 Mill 1 1 1 1 1 II 1 1 1 1 II Ml 1 1 II 1 1 

37 ATGAAAAAAATG C CTTTGTTTAGTAAATCACACAAAAATCCAGCAGAAATTGTGAAAATC 96 



Qy 
Db 

Qy 



61 CTGAAAGACAATTTGG CCATTTTGGAAAAGCAAGACAAAAAG CTTCAGAA 120 

1 1 1 i 1 1 1 1 1 i 1 1 1 1 1 : 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M I S 1 1 1 i I i 1 1 1 1 1 1 1 M 1 1 

97 CTGAAAGACAATTTGG CCATTTTGGAAAAG CAAGACAAAAAGACAGACAAGG CTTCAGAA 156 
121 GAAGTGTCTAAATCACTGCAAGCAATGAAAGAAATTCTGTGTGGTACAAACGAGAAAGAA 180 

1 1 1 MM 1 1 1 1 1 1 II MM I Mi I Mill II I II III II 1 1 1 II! 1 1 III II 1 1 1 !l 1 1 



157 GAAGTGTCTAAATCACTGC^^ 216 
CCC CCAACAGAAGCAGTGG CTCAG CTAG CACAAGAACTCTACAGCAGTGGCCTG CTAGTG 24 0 

I I M 1 1 1 1 1 1 ' 1 1 1 1 i Ml M 1 1 1 1 1 1 1 M 1 1 1 1 1 1 ! 1 1 1 1 1 1 Ml 1 1 1 1 1 M 1 1 1 1 1 1 

CCC CCAACAGAAGCAGTGG CTCAG 276 
ACACTGATAGCTGACCTGCAGCTGATAGACTTTGAGGGAAAAAAAGATGTGACCCAGATA 3 00 

MM I MM 1 1 1 1 1 1 1 II M I IMMM Ml 1 1 M II M 1 1 1 1 1 1 ! M ! 1 1 ! M 1 1 M 1 1 1 

ACACTGATAGCTGACCTGCAGCTGATAGACTTTGAGGGAAAAAAAGATGTGACCCAGATA 336 
TTTAACAACATCTTGAGAAGACAGATAGGCACTCGGAGTCCTACTGTGGAGTATATTAGT 360 

1 1 1 1 ! 1 1 1 1 1 M 1 1 1 1 1 1 1 1 1 1 1 1 1 Mill 1 1 1 1 M 1 1 1 1 1 M 1 1 1 M 1 1 1 1 ! ! 

TTTAACAACATCTTGAGAAGACAGATAGG CACT CGGAGT CCTACTGTGGAGTATATTAGT 3 96 
GCTCATCCTCA.TATCCTGTTTATGCTCCTCAAAGGATATGAAGCCCCACA.GATTGCCTTA 42 0 

! Ml I Ml III II 1 1 II Ml II! II I II I II MM 111 II 1 1 MM ! II M! Mill il 

GCTC^TCCTCATATCCTGTTTATGCTCCTCAJ^GGATATGAAGCCCCACAGATTGCCTTA 456 
CGTTGTGGGATTATGCTGAGAGAATGTATTCGACATGAACCACTTGCCAAAATCATCCTC 4 8 0 

! ! 1 1 ! 1 i , 1 1 . 1 i ' I 'i i 1 1 1 1 1 1 ! 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 

CGTTGTGGGATTATG CTGAGAGAATGTATT CGACATGAACCACTTG CCAAAATCATC CTC 516 
TTTTCTAATCAATTCAGAGATTTCTTTAAGTACGTGGAGTTGTCAACATTTGATATTGCT 54 0 

II I II I II M Ml I III M 1 1! 1 1 II I Ml MM MM 1 1 1 IMM 1 1 Ml II MMII 

TTTTCTAATCAATTCAGAGATTTCTTTAAGTACGTGGAGTTGTCAACATTTGATATTGCT 576 
TCAGATGCCTTTGCTACTTTCAA -GGATTTACTAACCAGACATAAAGTGTTGGTAGC - AG 598 

I Ml! MM Ml M I M M I M 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 II 1 1 1 II 

TCAGATGCCTTTGCTACTTTCAAGGGATTTACTAACCAGACATAAAGTGTTGGTAGCAAG 63 6 
ACTT CTTAGAACAAAATTACGACACTATTTTTGAAGACTATGAGAAATTG CTT CAGT CTG 658 

I I I M M 1 1 M M 1 1 M I M I M I M 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 

ACTTCTTAGAACAAAATTACGACACTANTTTTGAAGACTATGAGAAATTGCTTCAGTCTG 696 
AG -AATTATGTTACTAAGAGACAGTCTTTAAAG - CTGCTAGGGGAGCTGATCCTGGACCG 716 

II lllllllllll lllllllllll MMII llllll MM Mill MUM 

AGAAATTATGTTACCAAGAGACAGTCCTTAAAGCCTGCTAAGGGAACTGATTCTGGACCG 756 
TCACAACTTTGCCATC - ATGACAAAGTATATCAGCAAGCC GGAGAACCTGAAACTCA 772 

IN MINIM MM I I Mill Mill llllll III I I 1 1 I 

TCANAACTTTGCCATCAANGCAAAAGTTTATCAAC^ 816 
TGATGAACCTCCTTC 78 7 

II MM M I 

GGAGGAACCTCCTTC 831 



RESULT 11 
AX061831 

LOCUS AX061831 1026 bp DNA linear PAT 24-JAN-2001 

DEFINITION Sequence 1 from Patent WO0078947. 

ACCESSION AX061831 

VERSION AX061831.1 GI: 12539911 
KEYWORDS 

SOURCE Homo sapiens (human) 

ORGANISM Homo sapiens 

Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;






Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 
REFERENCE 1
AUTHORS den Daas,I., Fischer,V., Seyfried,C. and von Melchner,L.
TITLE Head trauma induced cytoplasmatic calcium binding protein
JOURNAL Patent: WO 0078947-A 1 28-DEC-2000;
MERCK PATENT GmbH (DE)
FEATURES Location/Qualifiers
source 1..1026
/organism="Homo sapiens"
/mol_type="genomic DNA"
/db_xref="taxon:9606"
CDS 1..1026
/note="unnamed protein product"
/codon_start=1
/protein_id="CAC25030.1"
/db_xref="GI:12539912"
/translation="MPFPFGKSHKSPADIVKNLKESMAVLEKQDISDKKAEKATEEVS
KNLVAMKEILYGTNEKEPQTEAVAQLAQELYNSGLLSTLVADLQLIDFEGKKDVAQIF
NNILRRQIGTRTPTVEYICTQQNILFMLLKGYESPEIALNCGIMLRECIRHEPLAKII
LWSEQFYDFFRYVEMSTFDIASDAFATFKDLLTRHKLLSAEFLEQHYDRFFSEYEKLL
HSENYVTKRQSLKLLGELLLDRHNFTIMTKYISKPENLKLMMNLLRDKSRNIQFEAFH
VFKVFVANPNKTQPILDILLKNQAKLIEFLSKFQNDRTEDEQFNDEKTYLVKQIRDLK
RPAQQEA"

BASE COUNT 359 a 199 c 203 g 265 t 

ORIGIN 

Query Match 57.5%; Score 582.6; DB 6; Length 1026; 

Best Local Similarity 74.7%; Pred. No. 4.3e-132; 

Matches 748; Conservative 0; Mismatches 244; Indels 9; Gaps 1; 
Qy         18 GTTTAGTAAATCACACAAAAATCCAGCAGAAATTGTGAAAATCCTGAAAGACAATTTGGC 77
              |||| | || || ||||||  ||||||||| |||||||| |  ||||| || |   ||||
Db         12 GTTTGGGAAGTCTCACAAATCTCCAGCAGACATTGTGAAGAATCTGAAGGAGAGCATGGC 71

Qy         78 CATTTTGGAAAAGCAAGAC---------AAAAAGACAGACAAGGCTTCAGAAGAAGTGTC 128
                || ||||||||||||||         |||||  |||| |||||| |||||||||| ||
Db         72 TGTTCTGGAAAAGCAAGACATTTCTGATAAAAAAGCAGAAAAGGCTACAGAAGAAGTTTC 131

Qy        129 TAAATCACTGCAAGCAATGAAAGAAATTCTGTGTGGTACAAACGAGAAAGAACCCCCAAC 188
               |||   |||   || |||||||||||||||| ||| ||||| || ||||| || |  ||
Db        132 CAAAAATCTGGTTGCCATGAAAGAAATTCTGTATGGCACAAATGAAAAAGAGCCTCAGAC 191

Qy        189 AGAAGCAGTGGCTCAGCTAGCACAAGAACTCTACAGCAGTGGCCTGCTAGTGACACTGAT 248
              ||||||||| ||||| || || ||||||||||| |  ||||| || ||    || ||| |
Db        192 AGAAGCAGTAGCTCAACTTGCTCAAGAACTCTATAATAGTGGGCTCCTTAGCACCCTGGT 251

Qy        249 AGCTGACCTGCAGCTGATAGACTTTGAGGGAAAAAAAGATGTGACCCAGATATTTAACAA 308
              ||||||  | ||||| || ||||||||||| |||||||| ||| | || || || |||||
Db        252 AGCTGATTTACAGCTCATTGACTTTGAGGGCAAAAAAGACGTGGCTCAAATTTTCAACAA 311

Qy        309 CATCTTGAGAAGACAGATAGGCACTCGGAGTCCTACTGTGGAGTATATTAGTGCTCATCC 368
               ||  | |||||||| || || ||  | | ||||||||| || || ||  |  | || |
Db        312 TATTCTCAGAAGACAAATTGGTACGAGAACTCCTACTGTTGAATACATCTGCACCCAACA 371

Qy        369 TCATATCCTGTTTATGCTCCTCAAAGGATATGAAGCCCCACAGATTGCCTTACGTTGTGG 428
                ||||  |||| ||| |  | ||||| |||||| | ||| | || ||  ||  ||||||
Db        372 GAATATTTTGTTCATGTTATTGAAAGGGTATGAATCTCCAGAAATAGCTCTAAATTGTGG 431

Qy        429 GATTATGCTGAGAGAATGTATTCGACATGAACCACTTGCCAAAATCATCCTCTTTTCTAA 488
               || ||| | |||||||| ||  |||||||||||||||| ||||||||  | |  ||  |
Db        432 AATAATGTTAAGAGAATGCATCAGACATGAACCACTTGCAAAAATCATTTTGTGGTCGGA 491

Qy        489 TCAATTCAGAGATTTCTTTAAGTACGTGGAGTTGTCAACATTTGATATTGCTTCAGATGC 548
               || ||    |||||||| |  || || ||  ||||||||||||| || |||||||||||
Db        492 ACAGTTTTATGATTTCTTCAGATATGTCGAAATGTCAACATTTGACATAGCTTCAGATGC 551

Qy        549 CTTTGCTACTTTCAAGGATTTACTAACCAGACATAAAGTGTTGGTAGCAGACTTCTTAGA 608
               ||||| || |||||||||||||| || ||||||||| || |    ||||| || || ||
Db        552 ATTTGCCACATTCAAGGATTTACTTACAAGACATAAATTGCTCAGTGCAGAATTTTTGGA 611

Qy        609 ACAAAATTACGACACTATTTTTGAAGACTATGAGAAATTGCTTCAGTCTGAGAATTATGT 668
              |||  |||| || |   ||||    || |||||||| || ||||| || || ||||||||
Db        612 ACAGCATTATGATAGATTTTTCAGTGAATATGAGAAGTTACTTCATTCAGAAAATTATGT 671

Qy        669 TACTAAGAGACAGTCTTTAAAGCTGCTAGGGGAGCTGATCCTGGACCGTCACAACTTTGC 728
               || || ||||||||  | ||||| || || || ||  | || ||  | ||||||||  |
Db        672 GACAAAAAGACAGTCACTGAAGCTTCTCGGTGAACTACTACTAGATAGACACAACTTCAC 731

Qy        729 CATCATGACAAAGTATATCAGCAAGCCGGAGAACCTGAAACTCATGATGAACCTCCTTCG 788
               || |||||||| || ||||| || || |||||||| ||| | ||||||||||| || ||
Db        732 AATTATGACAAAATACATCAGTAAACCTGAGAACCTCAAATTAATGATGAACCTGCTGCG 791

Qy        789 GGATAAAAGTCCCAACATCCAGTTTGAAGCCTTTCATGTTTTTAAGGTGTTTGTGGCCAG 848
               || ||||||| ||||||||||||||| |||||||| ||||||||||||||||| ||||
Db        792 AGACAAAAGTCGCAACATCCAGTTTGAGGCCTTTCACGTTTTTAAGGTGTTTGTAGCCAA 851

Qy        849 TCCTCACAAAACACAGCCTATTGTGGAGATCCTGTTAAAAAATCAGCCCAAACTCATTGA 908
              |||| |||| || ||||| ||  | || |||||  | || || ||| |||||||||| ||
Db        852 TCCTAACAAGACGCAGCCCATCCTAGACATCCTCCTCAAGAACCAGGCCAAACTCATAGA 911

Qy        909 GTTTCTGAGCAGCTTCCAAAAAGAAAGGACGGATGATGAGCAGTTCGCTGACGAGAAGAA 968
              ||| || ||||  || || || || |||||||| |||||||||||    ||||||||||
Db        912 GTTCCTCAGCAAGTTTCAGAACGACAGGACGGAGGATGAGCAGTTTAACGACGAGAAGAC 971

Qy        969 CTACTTGATTAAACAGATCCGAGACTTGAAGAAAACGGCCC 1009
              ||| ||  ||||||||||| | || ||||||| | | || |
Db        972 CTATTTAGTTAAACAGATCAGGGATTTGAAGAGACCAGCTC 1012



RESULT 12 

AX082322 

LOCUS AX082322 3281 bp DNA linear PAT 28-FEB-2001
DEFINITION Sequence 26 from Patent WO0111032.
ACCESSION AX082322
VERSION AX082322.1 GI:13184499
KEYWORDS
SOURCE Homo sapiens (human)
ORGANISM Homo sapiens
Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE 1
AUTHORS Hodgson,D.M., Lincoln,S.E., Russo,F.D., Spiro,P.A., Banville,S.C.,
Bratcher,S.R., Dufour,G.E., Cohen,H.J., Rosen,B.H., Chalup,M.S.,
Hillman,J.L., Jones,A.L., Yu,J.Y., Greenawalt,L.B., Panzer,S.R.,
Roseberry,A.M., Wright,R.J. and Daniels,S.E.
TITLE Secretory molecules
JOURNAL Patent: WO 0111032-A 26 15-FEB-2001;
Incyte Genomics, Inc. (US)
FEATURES Location/Qualifiers
source 1..3281
/organism="Homo sapiens"
/mol_type="genomic DNA"
/db_xref="taxon:9606"
/note="Incyte ID No: 481257.3"
BASE COUNT 1014 a 601 c 676 g 990 t

ORIGIN 

Query Match 57.5%; Score 582.6; DB 6; Length 3281; 

Best Local Similarity 74.7%; Pred. No. 4.3e-132; 

Matches 748; Conservative 0; Mismatches 244; Indels 9; Gaps 1; 
Qy 18 GTTTAGTAAATCACACAAAAATCCAGCAGAAAT^ 77 

1 1 1 1 I 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 ! 1 1 1 1 1 1 1 1 1 1 I 1 1 1 1 1 1 1 I 1 1 1 1 

Db 101 GTTTGGGAAGTCTCACAAATCTCCAGCAGACATTGTGAAGAATCTGAAGGAGAGC^TGG^ 160 
Qy 78 CATTTTGGAAAAGCAAGAC AAAAAGACAGACAAGGCTTCAGAAGAAGTGTC 128 

II Mill MINI III Mill Mil 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 E II 

Db 161 TGTTCTGGAAAAG CAAGACATTT CTGATAAAAAAG CAGAAAAGG CTACAGAAGAAGTTTC 22 0 

Qy 12 9 TAAATCACTGCAAGCAATGAAAGAAATTCTGTGTGGTAOW^ 188 

III III II III 1 1 II 1 1 II III 1 1 III Mill II Mill II I II 

Db 221 CAAAAATCTGGTTGCCATGAAAGAAATTCTGTATGG 2 80 

Qy 18 9 AGAAGCAGTGGCTCAGCTAGCACAAGAACTCTACAGCAGTGGCCTC 248 

III II III I Mill II II II MUM III I Mill II II MUM 

Db 281 AGAAGCAGTAGCTCAACTTGCTCAAGAACTCTATAATAGTGGGCTCCTTAGCACCCTGGT 340 

Qy 24 9 AGCTGACCTG(^GCTGATAGACTTTGAGGGAAAAAAAGATGTGACCCAGATATTTAACAA 308 

llllll I Mill II III III III II II MM II III I II II II Mill 

Db 341 AGCTGATTTACAGCTCATTGACTTTC^ 4 00 

Qy 3 09 CAT CTTGAGAAGACAGATAGG CACT CGGAGT CCTACTGTGGAGTATATTAGTGCTCATCC 368 

M I Ml, M II II II IIIIIIMI II II II I I II I 

Db 4 01 TATTCT(^GAAGACAAATTGGTACGAGAACTCCTACTGTTGAATA(^TCTGCACCC^(^ 4 60 

Qy 36 9 TCATATCCTGTTTATGCTCCTCAAAGGATATGAAGCCCCACAGATTGCCTTACGTTGTGG 428 

MM MM III I I Mill llllll I III I II II II I II I II 

Db 461 GAATATTTTGTTCATGTTATTGAAAGGGTATGAATCTCCAGAAATAGCTCTAAATTGTGG 52 0 

Qy 42 9 GATTATGCTGAGAGAATGTATTCGACATGAACCACTTGCCAAAATCATCCTCTTTTCTAA 488 

II III I I II III II II II II III Mill I II I Ml I I II I 

Db 521 AATAATGTTAAGAGAATGC^T(^GA(^TGAAC(^CTTGCAAAAATCATTTTGTGGTCGGA 58 0 

Qy 48 9 TCAATTC?VGAGATTTCTTTAAGTACGTGGAGTTGTCAACATTTGATATTGCTTCAGATGC 548 

II II llllll II I II II II I IIIIIIMI Ml II Mill II III I 

Db 581 ACAGTTTTATGATTTCTTCAGATATGTCGAAATGTCAACATTTGACATAGCTTCAGATGC 64 0 

Qy 54 9 CTTTGCTACTTTOUVGGATTTACTAAC(^GA(^TAAAGTGTTGGTAGC^GACTTCTTAGA 608 

Mill M llllllllllllll II IMIMIII II I Mill II I! II 

Db 641 ATTTGCCACATTCAAGGATTTACTTACAAGACATAAATTGCTCAGTGCAGAATTTTTGGA 700 



Qy        609 ACAAAATTACGACACTATTTTTGAAGACTATGAGAAATTGCTTCAGTCTGAGAATTATGT 668
              |||  |||| || |   ||||    || |||||||| || ||||| || || ||||||||
Db        701 ACAGCATTATGATAGATTTTTCAGTGAATATGAGAAGTTACTTCATTCAGAAAATTATGT 760

Qy        669 TACTAAGAGACAGTCTTTAAAGCTGCTAGGGGAGCTGATCCTGGACCGTCACAACTTTGC 728
               || || ||||||||  | ||||| || || || ||  | || ||  | ||||||||  |
Db        761 GACAAAAAGACAGTCACTGAAGCTTCTCGGTGAACTACTACTAGATAGACACAACTTCAC 820

Qy        729 CATCATGACAAAGTATATCAGCAAGCCGGAGAACCTGAAACTCATGATGAACCTCCTTCG 788
               || |||||||| || ||||| || || |||||||| ||| | ||||||||||| || ||
Db        821 AATTATGACAAAATACATCAGTAAACCTGAGAACCTCAAATTAATGATGAACCTGCTGCG 880

Qy        789 GGATAAAAGTCCCAACATCCAGTTTGAAGCCTTTCATGTTTTTAAGGTGTTTGTGGCCAG 848
               || ||||||| ||||||||||||||| |||||||| ||||||||||||||||| ||||
Db        881 AGACAAAAGTCGCAACATCCAGTTTGAGGCCTTTCACGTTTTTAAGGTGTTTGTAGCCAA 940

Qy        849 TCCTCACAAAACACAGCCTATTGTGGAGATCCTGTTAAAAAATCAGCCCAAACTCATTGA 908
              |||| |||| || ||||| ||  | || |||||  | || || ||| |||||||||| ||
Db        941 TCCTAACAAGACGCAGCCCATCCTAGACATCCTCCTCAAGAACCAGGCCAAACTCATAGA 1000

Qy        909 GTTTCTGAGCAGCTTCCAAAAAGAAAGGACGGATGATGAGCAGTTCGCTGACGAGAAGAA 968
              ||| || ||||  || || || || |||||||| |||||||||||    ||||||||||
Db       1001 GTTCCTCAGCAAGTTTCAGAACGACAGGACGGAGGATGAGCAGTTTAACGACGAGAAGAC 1060

Qy        969 CTACTTGATTAAACAGATCCGAGACTTGAAGAAAACGGCCC 1009
              ||| ||  ||||||||||| | || ||||||| | | || |
Db       1061 CTATTTAGTTAAACAGATCAGGGATTTGAAGAGACCAGCTC 1101



RESULT 13 

BC020570 

LOCUS BC020570 3761 bp mRNA linear PRI 22-JAN-2002
DEFINITION Homo sapiens, MO25 protein, clone MGC:21631 IMAGE:4397573, mRNA,
complete cds.
ACCESSION BC020570
VERSION BC020570.1 GI:18088260
KEYWORDS MGC.
SOURCE Homo sapiens (human)
ORGANISM Homo sapiens
Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE 1 (bases 1 to 3761)
AUTHORS Strausberg,R.
TITLE Direct Submission
JOURNAL Submitted (03-JAN-2002) National Institutes of Health, Mammalian
Gene Collection (MGC), Cancer Genomics Office, National Cancer
Institute, 31 Center Drive, Room 11A03, Bethesda, MD 20892-2590,
USA
REMARK NIH-MGC Project URL: http://mgc.nci.nih.gov
COMMENT Contact: MGC help desk
Email: cgapbs-r@mail.nih.gov
Tissue Procurement: ATCC
cDNA Library Preparation: Life Technologies, Inc.
cDNA Library Arrayed by: The I.M.A.G.E. Consortium (LLNL)
DNA Sequencing by: Sequencing Group at the Stanford Human Genome
Center, Stanford University School of Medicine, Stanford, CA 94305
Web site: http://www-shgc.stanford.edu
Contact: (Dickson, Mark) mcd@paxil.stanford.edu
Dickson, M., Schmutz, J., Grimwood, J., Rodriquez, A., and Myers,
R. M.
Clone distribution: MGC clone distribution information can be found
through the I.M.A.G.E. Consortium/LLNL at: http://image.llnl.gov
Series: IRAK Plate: 27 Row: d Column: 16
This clone was selected for full length sequencing because it
passed the following selection criteria: Hexamer frequency ORF
analysis, GenomeScan gene prediction.
FEATURES Location/Qualifiers
source 1..3761
/organism="Homo sapiens"
/mol_type="mRNA"
/db_xref="LocusID:51719"
/db_xref="taxon:9606"
/clone="MGC:21631 IMAGE:4397573"
/tissue_type="Duodenum, adenocarcinoma"
/clone_lib="NIH_MGC_88"
/lab_host="DH10B"
/note="Vector: pCMV-SPORT6"
CDS 325..1350
/codon_start=1
/product="MO25 protein"
/protein_id="AAH20570.1"
/db_xref="GI:18088261"
/translation="MPFPFGKSHKSPADIVKNLKESMAVLEKQDISDKKAEKATEEVS
KNLVAMKEILYGTNEKEPQTEAVAQLAQELYNSGLLSTLVADLQLIDFEGKKDVAQIF
NNILRRQIGTRTPTVEYICTQQNILFMLLKGYESPEIALNCGIMLRECIRHEPLAKII
LWSEQFYDFFRYVEMSTFDIASDAFATFKDLLTRHKLLSAEFLEQHYDRFFSEYEKLL
HSENYVTKRQSLKLLGELLLDRHNFTIMTKYISKPENLKLMMNLLRDKSRNIQFEAFH
VFKVFVANPNKTQPILDILLKNQAKLIEFLSKFQNDRTEDEQFNDEKTYLVKQIRDLK
RPAQQEA"
BASE COUNT 1171 a 709 c 808 g 1073 t
ORIGIN

Query Match 57.5%; Score 582.6; DB 9; Length 3761;
Best Local Similarity 74.7%; Pred. No. 4.3e-132;
Matches 748; Conservative 0; Mismatches 244; Indels 9; Gaps 1;



Qy         18 GTTTAGTAAATCACACAAAAATCCAGCAGAAATTGTGAAAATCCTGAAAGACAATTTGGC 77
              |||| | || || ||||||  ||||||||| |||||||| |  ||||| || |   ||||
Db        336 GTTTGGGAAGTCTCACAAATCTCCAGCAGACATTGTGAAGAATCTGAAGGAGAGCATGGC 395

Qy         78 CATTTTGGAAAAGCAAGAC---------AAAAAGACAGACAAGGCTTCAGAAGAAGTGTC 128
                || ||||||||||||||         |||||  |||| |||||| |||||||||| ||
Db        396 TGTTCTGGAAAAGCAAGACATTTCTGATAAAAAAGCAGAAAAGGCTACAGAAGAAGTTTC 455

Qy        129 TAAATCACTGCAAGCAATGAAAGAAATTCTGTGTGGTACAAACGAGAAAGAACCCCCAAC 188
               |||   |||   || |||||||||||||||| ||| ||||| || ||||| || |  ||
Db        456 CAAAAATCTGGTTGCCATGAAAGAAATTCTGTATGGCACAAATGAAAAAGAGCCTCAGAC 515

Qy        189 AGAAGCAGTGGCTCAGCTAGCACAAGAACTCTACAGCAGTGGCCTGCTAGTGACACTGAT 248
              ||||||||| ||||| || || ||||||||||| |  ||||| || ||    || ||| |
Db        516 AGAAGCAGTAGCTCAACTTGCTCAAGAACTCTATAATAGTGGGCTCCTTAGCACCCTGGT 575

Qy        249 AGCTGACCTGCAGCTGATAGACTTTGAGGGAAAAAAAGATGTGACCCAGATATTTAACAA 308
              ||||||  | ||||| || ||||||||||| |||||||| ||| | || || || |||||
Db        576 AGCTGATTTACAGCTCATTGACTTTGAGGGCAAAAAAGACGTGGCTCAAATTTTCAACAA 635

Qy 3 09 CAT CTTGAGAAGACAGATAGG CACTCGGAGTC CTACTGTGG AGTATATTAGTG CTCATC C 3 68 

II I IMIIIII II II II II MM Mill M M II I I II I _ 

Db 63 6 TATTCTCAGAAGACAAATTGGTACGAGAACTCCTACTGTTGAATACATCTGCACCCAACA 695 

Qy 369 TCATATCCTGTTTATGCTCCTCAAAGGATATGAAGCCCCACAGATTGCCTTACGTTGTGG 428

IMI MM Ml I I Mill MMM I III I M M II MMM 

Db 696 GAATATTTTGTTCATGTTATTGAAAGGGTATGAATCTCCAGAAATAGCTCTAAATTGTGG 755
Qy 42 9 GATTATGCTGAGAGAATGTATT CGACATGAAC CACTTGCCAAAATCATCCT CTTTT CTAA 48 8 

II Ml I IIIIIIII II llllllllllllllll IMIIIII I I II I 

Db 756 AATAATGTTAAGAGAATGCATCAGAC^TGAAC^ 815 

Qy 489 TCAATTC^GAGATTTCTTTAAGTACGTGGAGTTGTCAACATTTGATATTGCTTCAGATGC 548 

II II IMIIIII I M II II MMIMIMM II lllllllllll 

Db 816 ACAGTTTTATGATTTCTTCAGATATGTCGAAATGTCAACATTTGACATAGCTTCAGATGC 875 

Qy 549 CTTTGCTACTTTCAAGGATTTACTAACCAGACATAAAGTGTTGGTAGCAGACTTCTTAGA 608 

1 1 II I II III lllllllllll II I IIIIIIII 11 I Mill II II II 

Db 876 ATTTGCCACATTCAAGGATTTACTTACAAGACATAAATTGCTCAGTGCAGAATTTTTGGA 935 

Qy 609 ACAAAATTACGACACTATTTTTGAAGACTATGAGAAATTGCTTCAGTCTGAGAATTATGT 668 

Ml MM II I MM II IIIIIIII II MMI II II IIIIIIII 

Db 936 ACAGCATTATGATAGATTTTTCAGTGAATATGAGAAGTTACTTCATTCAGAAAATTATGT 995 

Qy 669 TACTAAGAGACAGT CTTTAAAG CTG CTAGGGGAG CTGATC CTGGAC CGTCACAACTTTG C 728 

II M IIIIIIII I Mill II II II II I II II I MMMM I 

Db 996 GACAAAAAGACAGTCACTGAAGCTTCTCGGTGAACTACTACTAGATAGACACAACTTCAC 1055 

Qy 72 9 CATCATGACAAAGTATATCAGCAAGCCGGAGAACCTGA 788 

II IMIIIII II Mill M M IIIIIIII III I lllllllllll II M 

Db 1056 AATTATGACAAAATACATCAGTAAACCTGAGAACCTCAAATTAATGATGAACCTGCTGCG 1115 

Qy 78 9 GGATAAAAGTCCCAACATCCAGTTTGAAGCCTTTCATGTTTTTAAGGTGTTTGTGGCCAG 848 

II II I IMI I MM II MM I III IIIIMII I III II IIIIMII Ml MM 

Db 1116 AGACAAAAGTCGCAAC^TCCAGTTTGAGGCCTTTCACGTTTTTAAGGTGTTTGTAGCCAA 1175 

Qy 84 9 TCCTCACAAAACA(IAGCCTATTGTGGAG 908 

MM IMI II Mill M I II MMI I II 1 1 II I II II II II 1 1 II 

Db 1176 TCCTAAC^GACG<^GCCC^TCCT^^ 1235 

Qy 909 GTTTCTGAGCAGCTTCCAAAAAGAAAGGACGGATGATGAGCAGTTCGCTGACGAGAAGAA 968 

Ml II MM II II II II MMMM II MMM III IIIIIMIII 

Db 1236 GTTCCTCAGCAAGTTTCAGAACGACAGGACGGAGGATGAGCAGTTTAACGACGAGAAGAC 12 95 

Qy 96 9 CTACTTGATTAAACAGATCCGAGACTTGAAGAAAACGGCCC 1009 

III II MMM Mill I II lllllll I I II I 

Db 12 96 CTATTTAGTTAAACAGATCAGGGATTTGAAGAGACCAGCTC 1336 
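
The summary figures printed for each hit appear to follow from simple arithmetic. As a minimal sketch, assuming that Best Local Similarity is the share of matches among all aligned columns (matches + mismatches + indels) and that Query Match is the score divided by the perfect score of 1014 stated in the run header, the RESULT 13 values can be reproduced as below. The function names are hypothetical and are not part of the search program.

```python
def best_local_similarity(matches: int, mismatches: int, indels: int) -> float:
    """Percent of aligned columns that are matches (assumed definition)."""
    columns = matches + mismatches + indels
    return 100.0 * matches / columns


def query_match(score: float, perfect_score: float) -> float:
    """Percent of the perfect score reached by a hit (assumed definition)."""
    return 100.0 * score / perfect_score


# Figures taken from RESULT 13; 1014 is the perfect score from the run header.
similarity = round(best_local_similarity(748, 244, 9), 1)
match_pct = round(query_match(582.6, 1014), 1)
print(similarity, match_pct)
```

Under these assumptions the same two functions also reproduce the 98.5% and 76.0% reported for RESULT 9 (798 matches, 10 mismatches, 2 indels, score 770.6).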

RESULT 14 
AF151824 

LOCUS AF151824 1680 bp mRNA linear PRI 18-MAY-2000
DEFINITION Homo sapiens CGI-66 protein mRNA, complete cds.
ACCESSION AF151824
VERSION AF151824.1 GI:4929600
KEYWORDS
SOURCE Homo sapiens (human)
ORGANISM Homo sapiens
Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE 1 (bases 1 to 1680)
AUTHORS Lai,C.H., Chou,C.Y., Ch'ang,L.Y., Liu,C.S. and Lin,W.
TITLE Identification of novel human genes evolutionarily conserved in
Caenorhabditis elegans by comparative proteomics
JOURNAL Genome Res. 10 (5), 703-713 (2000)
MEDLINE 20272150
PUBMED 10810093
REFERENCE 2 (bases 1 to 1680)
AUTHORS Lin,W.-C.
TITLE Direct Submission
JOURNAL Submitted (17-MAY-1999) Institute of Biomedical Sciences, Academia
Sinica, No. 128, Sec. II, Academia Road, Taipei 115, Taiwan
FEATURES Location/Qualifiers
source 1..1680
/organism="Homo sapiens"
/mol_type="mRNA"
/db_xref="taxon:9606"
CDS 1..1026
/codon_start=1
/product="CGI-66 protein"
/protein_id="AAD34061.1"
/db_xref="GI:4929601"
/translation="MPFPFGKSHKSPADIVKNLKESMAVLEKQDISDKKAEKATEEVS
KNLVAMKEILYGTNEKEPQTEAVAQLAQELYNSGLLSTLVADLQLIDFEGKKDVAQIF
NNILRRQIGTRTPTVEYICTQQNILFMLLKGYESPEIALNCGIMLRECIRHEPLAKII
LWSEQFYDFFRYVEMSTFDIASDAFATFKDLLTRHKLLSAEFLEQHYDRFFSEYEKLL
HSENYVTKRQSLKLLGELLLDRHNFTIMTKYISKPENLKLMMNLLRDKSRNIQFEAFH
VFKVFVANPNKTQPILDILLKNQAKLIEFLSKFQNDRTEDEQFNDEKTYLVKQIRDLK
RPAQQEA"
BASE COUNT 540 a 324 c 358 g 458 t
ORIGIN

Query Match 57.4%; Score 581.6; DB 9; Length 1680;
Best Local Similarity 74.7%; Pred. No. 7.5e-132;
Matches 747; Conservative 0; Mismatches 244; Indels 9; Gaps 1;



Qy         19 TTTAGTAAATCACACAAAAATCCAGCAGAAATTGTGAAAATCCTGAAAGACAATTTGGCC 78
              ||| | || || ||||||  ||||||||| |||||||| |  ||||| || |   ||||
Db         13 TTTGGGAAGTCTCACAAATCTCCAGCAGACATTGTGAAGAATCTGAAGGAGAGCATGGCT 72

Qy         79 ATTTTGGAAAAGCAAGAC---------AAAAAGACAGACAAGGCTTCAGAAGAAGTGTCT 129
               || ||||||||||||||         |||||  |||| |||||| |||||||||| ||
Db         73 GTTCTGGAAAAGCAAGACATTTCTGATAAAAAAGCAGAAAAGGCTACAGAAGAAGTTTCC 132

Qy        130 AAATCACTGCAAGCAATGAAAGAAATTCTGTGTGGTACAAACGAGAAAGAACCCCCAACA 189
              |||   |||   || |||||||||||||||| ||| ||||| || ||||| || |  |||
Db        133 AAAAATCTGGTTGCCATGAAAGAAATTCTGTATGGCACAAATGAAAAAGAGCCTCAGACA 192

Qy        190 GAAGCAGTGGCTCAGCTAGCACAAGAACTCTACAGCAGTGGCCTGCTAGTGACACTGATA 249
              |||||||| ||||| || || ||||||||||| |  ||||| || ||    || ||| ||
Db        193 GAAGCAGTAGCTCAACTTGCTCAAGAACTCTATAATAGTGGGCTCCTTAGCACCCTGGTA 252



Qy 250 GCTGACCTGCAGCTGATAGACTTTGAGGGAAAAAAAGATGTGACCCAGATATTTAACAAC 3 09 

HIM I 1 1 1 1 1 II lllllllllll MINIM Ml I M II II Mill 

Db 253 GCTGATTTACAGCTCATTGACTTTGAGGGCAAAAAAGACGTGGCTCAAATTTTCAACAAT 312 

Qy 310 ATCTTGAGAAGACAGATAGG CACTCGGAGTCCTACTGTGGAGTATATTAGTG CT CAT C CT 369 

II I IIIIMM II II II II IMIIIIM II II II I MM 
Db 313 ATTCTCAGAAGACAAATTGGTACGAGAACTCCTACTGTTGAATACATCTGCACCCAACAG 372 

Qy 370 CATATCCTGTTTATGCTCCTCAAAGGATATGAAGCCCCACAGATTGCCTTACGTTGTGGG 429 

Ml MM Ml I I HIM HIM I III I II II II MUM 

Db 373 AATATTTTGTTCATGTTATTGAAAGGGTATGAATCTCCAGAAATAGCTCTAAATTGTGGA 432 

Qy 430 ATTATGCTGAGAGAATGTATTCGACATGAACCACTTGCCAAAATCATCCTCTTTTCTAAT 489 

II Ml I IIIIMM II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 IIIIMM I I II I 

Db 4 33 ATAATGTTAAGAGAATGCATCAGACATGAACCACTTGCAAAAATCATTTTGTGGTCGGAA 4 92 

Qy 4 90 CAATTCAGAGATTTCTTTAAGTACGTGGAGTTGTCAACATTTGATATTGCTTCAGATGCC 54 9 

II II II Ml Ml I MM II III II II I II Ml M MINIUM 

Db 4 93 CAGTTTTATGATTTCTTCAGATATGTCGAAATGTO^CATTTGACATAGCTTGA,GATGC^ 552 

Qy 550 TTTGCTACTTTCAAGGATTTACTAACCAGACATAAAGTGTTGGTAGCAGACTTCTTAGAA 609 

MM II lllllllllll III II I Mil II I II I MM II MM 

Db 553 TTTGCCACATTCAAGGATTTACTTACAAGACATAAATTGCTCAGTGCAGAATTTTTGGAA 612 

Qy 610 CAAAATTACGACACTATTTTTGAAGACTATGAGAAATTGCTTCAGTCTGAGAATTATGTT 669 

1 1 Ml II I Ml II MINN II MM II II 

Db 613 CAGCATTATGATAGATTTTTCAGTGAATATGAGAAGTTACTTCATTCAGAAAATTATGTG 672 

Qy 670 ACTAAGAGACAGTCTTTAAAGCTGCTAGGGGAGCTGATCCTGGACCGTCACAACTTTGCC 72 9 

II II MINN I MM II II II II I II II I MINN I 

Db 673 ACAAAAAGACAGTCACTGAAGCTTCTCGGTGAACTACTACTAGATAGACACAACTTCACA 732 

Qy 730 AT CATGACAAAGTATATCAG CAAG C CGGAGAACCTGAAACT CATGATGAACCTC CTTCGG 78 9 

II IIIIMM II MMI II II IIIIMM III I III IIIIMM II II 

Db 733 ATTATGACAAAATACATCAGTAAACCTGAGAAC CT CAAATTAATGATGAAC CTGCTG CGA 792 

Qy 7 90 GATAAAAGTCCCAACATCCAGTTTGAAGCCTTTCATGTTTTTAAGGTGTTTGTGGCCAGT 84 9 

II MMI III I MM Ml I MINN 1 1 III 1 1 1 1 1 1 1 1 III I Ml I 

Db 793 GACAAAAGTCGCAACATCCAGTTTGAGGCCTTTCACGTTTTTAAGGTGTTTGTAGCCAAT 852 

Qy 850 CCTCACAAAACACAGCCTATTGTGGAGATCCTGTTAAAAAATCAGCCCAAACTCATTGAG 909 

III Ml N Mil II I II INN I II II III MINIM III 

Db 853 CCTAACAAGACGCAGCCCATCCTAGACATCCTCCTCAAGAACCAGGCCAAACTCATAGAG 912 

Qy 910 TTTCTGAG CAG CTT C CAAAAAGAAAGGACGGATGATGAG CAGTT CGCTGACGAGAAGAAC 96 9 

II II MM II II II II IIIIMM lllllllllll 1 1 II I II MM 

Db 913 TTCCTCAGCAAGTTTCAGAACGACAGGACGGAGGATGAGCAGTTTAACGACGAGAAGACC 972 

Qy 97 0 TACTTGATTAAACAGATCCGAGACTTGAAGAAAACGGCCC 100 9 

II II II Mill MM I II II Ml II I I M I 

Db 973 TATTTAGTTAAACAGAT CAGGGATTTGAAGAGACCAG CT C 1012 



RESULT 15 

AF113536 

LOCUS AF113536 3466 bp mRNA linear PRI 04-DEC-1999
DEFINITION Homo sapiens MO25 protein mRNA, complete cds.
ACCESSION AF113536
VERSION AF113536.1 GI:6523826
KEYWORDS
SOURCE Homo sapiens (human)
ORGANISM Homo sapiens
Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE 1 (bases 1 to 3466)
AUTHORS Jin,W., Shi,J., Ren,S., Gu,J., Fu,S., Huang,Q., Dong,H., Yu,Y.,
Fu,G., Wang,Y., Chen,Z. and Han,Z.
TITLE A novel gene expressed in the human hypothalamus
JOURNAL Unpublished
REFERENCE 2 (bases 1 to 3466)
AUTHORS Jin,W., Shi,J., Ren,S., Gu,J., Fu,S., Huang,Q., Dong,H., Yu,Y.,
Fu,G., Wang,Y., Chen,Z. and Han,Z.
TITLE Direct Submission
JOURNAL Submitted (16-DEC-1998) Chinese National Human Genome Center at
Shanghai, 351 Guo Shoujing Rd., Zhangjiang Hi-Tech Park, Pudong,
Shanghai 201203, China
FEATURES Location/Qualifiers
source 1..3466
/organism="Homo sapiens"
/mol_type="mRNA"
/db_xref="taxon:9606"
/tissue_type="hypothalamus"
CDS 54..1079
/codon_start=1
/product="MO25 protein"
/protein_id="AAF14873.1"
/db_xref="GI:6523827"
/translation="MPFPFGKSHKSPADIVKNLKESMAVLEKQDISDKKAEKATEEVS
KNLVAMKEILYGTNEKEPQTEAVAQLAQELYNSGLLSTLVADLQLIDFEGKKDVAQIF
NNILRRQIGTRTPTVEYICTQQNILFMLLKGYESPEIALNCGIMLRECIRHEPLAKII
LWSEQFYDFFRYVEMSTFDIASDAFATFKDLLTRHKLLSAEFLEQHYDRFFSEYEKLL
HSENYVTKRQSLKLLGELLLDRHNFTIMTKYISKPENLKLMMNLLRDKSRNIQFEAFH
VFKVFVANPNKTQPILDILLKNQAKLIEFLSKFQNDRTEDEQFNDEKTYLVKQIRDLK
RPAQQEA"
BASE COUNT 1101 a 606 c 689 g 1070 t
ORIGIN

Query Match 57.4%; Score 581.6; DB 9; Length 3466;
Best Local Similarity 74.7%; Pred. No. 7.6e-132;
Matches 747; Conservative 0; Mismatches 244; Indels 9; Gaps 1;



Qy      19 TTTAGTAAATCACACAAAAATCCAGCAGAAATTGTGAAAATCCTGAAAGACAATTTGGCC 78
           Ml I II II MINI MINIMI MINIM I INN II I INI
Db      66 TTTGGAAAGTCTCACAAATCTCCAGCAGACATTGTGAAGAATCTGAAGGAGAGCATGGCT 125

Qy      79 ATTTTGGAAAAGCAAGAC---------AAAAAGACAGACAAGGCTTCAGAAGAAGTGTCT 129
           M i 1 1 1 1 1 1 ! I ! M II INN INI WWW MINIUM II
Db     126 GTTCTGGAAAAGCAAGACATTTCTGATAAAAAAGCAGAAAAGGCTACAGAAGAAGTTTCC 185

Qy     130 AAATCACTGCAAGCAATGAAAGAAATTCTGTGTGGTACAAACGAGAAAGAACCCCCAACA 189
           Nl III 1 1 1 1 1 1 1 1 1 M 1 1 M M I IN INN II INN II I III
Db     186 AAAAATCTGGTTGCCATGAAAGAAATTCTGTATGGCACAAATGAAAAAGAGCCTCAGACA 245

Qy     190 GAAGCAGTGGCTCAGCTAGCACAAGAACTCTACAGCAGTGGCCTGCTAGTGACACTGATA 249
           llllllll Mill II II IIMIIIIIII I Mill II II M III M
Db     246 GAAGCAGTAGCTCAACTTGCTCAAGAACTCTATAATAGTGGGCTCCTTAGCACCCTGGTA 305

Qy     250 GCTGACCTGCAGCTGATAGACTTTGAGGGAAAAAAAGATGTGACCCAGATATTTAACAAC 309
           Mill I Mill II 1 1 1 1 1 1 1 1 1 1 1 MMMM III I II 1 1 M Mill
Db     306 GCTGATTTACAGCTCATTGACTTTGAGGGC 365

Qy     310 ATCTTGAGAAGACAGATAGGCACTCGGAGTCCTACTGTGGAGTATATTAGTGCTCATCCT 369
           I | I MMMM II II II II IIMIIMI II II II I I II I
Db     366 ATTCTCAGAAGACAAATTGGTACGAGAACTCCTACT 425

Qy     370 CATATCCTGTTTATGCTCCTCAAAGGATATGAAGCCCCACAGATTGCCTTACGTTGTGGG 429
           MM MM III I I IMII MUM I Ml I II II II MUM
Db     426 AATATTTTGTTCATGTTATTGAAAGGGTATGAATCTCCAGAAATAGCTCTAAATTGTGGA 485

Qy     430 ATTATGCTGAGAGAATGTATTCGACATGAACCACTTGCCAAAATCATCCTCTTTTCTAAT 489
           M III I MMMM II I II I II I M II II 1 1 1 MMMM II II I
Db     486 ATAATGTTAAGAGAATGCATCAGACATC 545

Qy     490 CAATTCAGAGATTTCTTTAAGTACGTGGAGTTGTCAACATTTGATATTGCTTCAGATGCC 549
           II II MMMM I II II II I III Mill I Ml II III MUM II
Db     546 CAGTTTTATGATTTCTTCAGATATGTCGAAATGTCAACATTTGACATAGCTTCAGATGCA 605

Qy     550 TTTGCTACTTTCAAGGATTTACTAACCAGACATAAAGTGTTGGTAGCAGACTTCTTAGAA 609
           Mill M IMIMIIIIIMI M IIMIIMI M I Mill II II III
Db     606 TTTGCCACATTCAAGGATTTACTTACAAGACATAAATTGCTCAGTGCAGAATTTTTGGAA 665

Qy     610 CAAAATTACGACACTATTTTTGAAGACTATGAGAAATTGCTTCAGTCTGAGAATTATGTT 669
           II MM II I MM II MMMM II Mill M II MMMM
Db     666 CAGCATTATGATAGATTTTTCAGTGAATATGAGAAGTTACTTCATTCAGAAAATTATGTG 725

Qy     670 ACTAAGAGACAGTCTTTAAAGCTGCTAGGGGAGCTGATCCTGGACCGTCACAACTTTGCC 729
           || II llllllll MUM II II II II Mill lllllllll l
Db     726 ACAAAAAGACAGTCACTGAAGCTTCTCGGTGAACTACTACTAGATAGACACAACTTCACA 785

Qy     730 ATCATGACAAAGTATATCAGCAAGCCGGAGAACCTGAAACTCATGATGAACCTCCTTCGG 789
           M MMMM M Mill II II MMMM III I IMIIMMM II II
Db     786 ATTATGACAAAATACATCAGTAAACCTGAGAACCTCAAATTAATGATGAACCTGCTGCGA 845

Qy     790 GATAAAAGTCCCAACATCCAGTTTGAAGCCTTTCATGTTTTTAAGGTGTTTGTGGCCAGT 849
           II lllllll lllllllllllllll llllllll MMMM MM I
Db     846 GACAAAAGTCGCAACATCCAGTTTGAGGCCTTTCACGTTTTTAAGGTGTTTGTAGCCAAT 905

Qy     850 CCTCACAAAACACAGCCTATTGTGGAGATCCTGTTAAAAAATCAGCCCAAACTCATTGAG 909
           Ml MM II Mill II I M Mill I II II III IIIIIIMII Ml
Db     906 CCTAACAAGACGCAGCCCATCCTAGACATCCT 965

Qy     910 TTTCTGAGCAGCTTCCAAAAAGAAAGGACGGATGATGAGCAGTTCGCTGACGAGAAGAAC 969
           MMMM II II II II MMMM IIMIIIIIII llllllllll I
Db     966 TTCCTCAGCAAGTTTCAGAACGACAGGACGGAGGATGAGCAGTTTAACGACGAGAAGACC 1025

Qy     970 TACTTGATTAAACAGATCCGAGACTTGAAGAAAACGGCCC 1009
           II II IIMIIIIIII I II lllllll I I II I
Db    1026 TATTTAGTTAAACAGATCAGGGATTTGAAGAGACCAGCTC 1065



Search completed: January 6, 2004, 02:34:57 
Job time : 3971 secs
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Run on:          January 6, 2004, 00:31:27 ; Search time 390 Seconds
                 (without alignments)
                 7018.539 Million cell updates/sec

Title:           US-10-088-872-1
Perfect score:   1014
Sequence:        1 atgaaaaaaatgcctttgtt tgaagaaaacggccccttga 1014

Scoring table:   IDENTITY_NUC
                 Gapop 10.0 , Gapext 1.0

Searched:        2552756 seqs, 1349719017 residues

Total number of hits satisfying chosen parameters:   5105512

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
                 Maximum Match 100%
                 Listing first 45 summaries

Database :       N_Geneseq_19Jun03:*
                 1:  /SIDSl/gcgdata/geneseq/geneseqn-embl/NA1980.DAT:*
                 2:  /SIDSl/gcgdata/geneseq/geneseqn-embl/NA1981.DAT:*
                 3:  /SIDSl/gcgdata/geneseq/geneseqn-embl/NA1982.DAT:*
                 4:  /SIDSl/gcgdata/geneseq/geneseqn-embl/NA1983.DAT:*
                 5:  /SIDSl/gcgdata/geneseq/geneseqn-embl/NA1984.DAT:*
                 6:  /SIDSl/gcgdata/geneseq/geneseqn-embl/NA1985.DAT:*
                 7:  /SIDSl/gcgdata/geneseq/geneseqn-embl/NA1986.DAT:*
                 8:  /SIDSl/gcgdata/geneseq/geneseqn-embl/NA1987.DAT:*
                 9:  /SIDSl/gcgdata/geneseq/geneseqn-embl/NA1988.DAT:*
                 10: /SIDSl/gcgdata/geneseq/geneseqn-embl/NA1989.DAT:*
                 11: /SIDSl/gcgdata/geneseq/geneseqn-embl/NA1990.DAT:*
                 12: /SIDSl/gcgdata/geneseq/geneseqn-embl/NA1991.DAT:*
                 13: /SIDSl/gcgdata/geneseq/geneseqn-embl/NA1992.DAT:*
                 14: /SIDSl/gcgdata/geneseq/geneseqn-embl/NA1993.DAT:*
                 15: /SIDSl/gcgdata/geneseq/geneseqn-embl/NA1994.DAT:*
                 16: /SIDSl/gcgdata/geneseq/geneseqn-embl/NA1995.DAT:*
                 17: /SIDSl/gcgdata/geneseq/geneseqn-embl/NA1996.DAT:*
                 18: /SIDSl/gcgdata/geneseq/geneseqn-embl/NA1997.DAT:*
                 19: /SIDSl/gcgdata/geneseq/geneseqn-embl/NA1998.DAT:*
                 20: /SIDSl/gcgdata/geneseq/geneseqn-embl/NA1999.DAT:*
                 21: /SIDSl/gcgdata/geneseq/geneseqn-embl/NA2000.DAT:*
                 22: /SIDSl/gcgdata/geneseq/geneseqn-embl/NA2001A.DAT:*
                 23: /SIDSl/gcgdata/geneseq/geneseqn-embl/NA2001B.DAT:*
                 24: /SIDSl/gcgdata/geneseq/geneseqn-embl/NA2002.DAT:*
                 25: /SIDSl/gcgdata/geneseq/geneseqn-embl/NA2003.DAT:*
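
The throughput figure in the run header can be reproduced from the other header values. The sketch below assumes that one "cell update" is one query-residue by database-residue comparison and that both strands of every database sequence were searched (the hit total is twice the sequence count); the function name is illustrative and not part of the search software.

```python
def cell_updates_per_sec(query_len, db_residues, seconds, strands=2):
    """Dynamic-programming cells filled per second of search time.

    strands=2 is an assumption: each database sequence is compared
    in both orientations.
    """
    return query_len * db_residues * strands / seconds


# Values from the run header: 1014-base query, 1349719017 residues, 390 s.
rate = cell_updates_per_sec(1014, 1_349_719_017, 390)
print(f"{rate / 1e6:.3f} Million cell updates/sec")  # prints 7018.539 ...
```

Under these assumptions the computed rate agrees with the reported 7018.539 Million cell updates/sec.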



Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
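
The scores in the summaries come from a local alignment scored with the IDENTITY_NUC table and the gap parameters in the run header (Gapop 10.0, Gapext 1.0). A minimal sketch of such a score computation follows, using Smith-Waterman with affine gaps (Gotoh recurrences). The match and mismatch values (+1 and -0.6) are assumptions inferred from the reported scores, since the table itself is not printed here, and a gap of length n is assumed to cost Gapop + n x Gapext. This is not the search program's own implementation.

```python
def sw_score(query, target, match=1.0, mismatch=-0.6,
             gap_open=10.0, gap_ext=1.0):
    """Best local alignment score with affine gap penalties.

    A gap of length n costs gap_open + n * gap_ext (assumed convention).
    """
    neg_inf = float("-inf")
    n = len(target)
    h_prev = [0.0] * (n + 1)   # H values for the previous query row
    f = [neg_inf] * (n + 1)    # best score ending in a vertical gap, per column
    best = 0.0
    for q_char in query:
        h_cur = [0.0] * (n + 1)
        e = neg_inf            # best score ending in a horizontal gap
        for j in range(1, n + 1):
            e = max(e - gap_ext, h_cur[j - 1] - gap_open - gap_ext)
            f[j] = max(f[j] - gap_ext, h_prev[j] - gap_open - gap_ext)
            step = match if q_char == target[j - 1] else mismatch
            h_cur[j] = max(0.0, h_prev[j - 1] + step, e, f[j])
            best = max(best, h_cur[j])
        h_prev = h_cur
    return best


print(sw_score("ACGT", "ACGT"))  # identical sequences: one point per base
```

The rolling-row layout keeps memory linear in the target length; it returns only the score, not the alignment path.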

SUMMARIES 



  Result             Query
     No.    Score    Match   Length   DB   ID          Description

       1     1014    100.0     1014   22   AAF86462    Human Acute Neuron
       2     1014    100.0     1421   22   AAI58234    Human polynucleoti
       3   1010.8     99.7     1344   21   AAA27332
       4    992.8     97.9     2002   22   AAH15879    Human cDNA sequenc
       5    770.6     76.0      822   22   AAH05471    Human cDNA clone (
       6    684.6     67.5      831   20   AAX39817    Gastric cancer ass
c      7    684.4     67.5     1191   22   AAI60020    Human polynucleoti
       8    582.6     57.5     1026   22   AAC91772    Human ANIC-BP (acu
       9    582.6     57.5     3281   24   ABK13127    Human secretory po
      10    582.6     57.5     3849   23   ABV22987    Human prostate exp
      11    582.6     57.5     3849   23   ABV28822    Human prostate exp
      12    541.6     53.4     1053   22   AAF30688    Human acute neuron
      13    539.6     53.2     1162   23   AAS89557    DNA encoding novel
c     14    520.2     51.3      833   20   AAX39818    Gastric cancer ass
      15      496     48.9     2492   23   AAS88031    DNA encoding novel
      16    387.8     38.2      722   20   AAZ15133    Human gene express
      17    362.8     35.8     2231   23   ABL07151    Drosophila melanog
      18    362.8     35.8     4231   23   ABL07150    Drosophila melanog
      19    288.8     28.5      690   24   ABS77084    Frog embryonic gen
      20    246.4     24.3      435   24   ABL82285    Human ovarian canc
      21    244.8     24.1      447   24   ABL82921    Human ovarian canc
      22    244.8     24.1      450   24   ABL81975    Human ovarian canc
      23    210.8     20.8      762   24   ABS76784    Frog embryonic gen
      24    210.4     20.7     1474   21   AAC32983    Arabidopsis thalia
      25    208.8     20.6     1497   21   AAC40181    Arabidopsis thalia
      26    200.2     19.7      918   21   AAC42766    Arabidopsis thalia
      27    200.2     19.7     1032   21   AAC48253    Arabidopsis thalia
c     28      195     19.2      387   24   ABN93983    Gene #481 used to
c     29      195     19.2      387   24   ABL66143    Lung cancer relate
      30    169.8     16.7      722   24   AAS61992    Porcine muscular s
      31    166.6     16.4      700   24   AAS61993    Porcine muscular s
      32    163.8     16.2      481   25   ABZ19574    Group III cDNA can
      33    163.4     16.1      300   20   AAZ14552    Human gene express
      34    161.2     15.9     1515   21   AAC50415    Arabidopsis thalia
      35      156     15.4      861   24   ABN98824    Arabidopsis thalia
c     36    153.4     15.1      737   23   AAS79449    DNA encoding novel
      37    147.2     14.5      464   21   AAC46721    Zea mays DNA fragm
      38    133.2     13.1      615   22   AAH07116    Human cDNA clone (
      39    107.6     10.6     1149   23   AAS88030    DNA encoding novel
      40    107.6     10.6     3279   23   AAS89559    DNA encoding novel
      41     65.6      6.5      432   24   ABN78107    Human ORF3054 cDNA
      42       65      6.4      487   22   AAI98879    Human excretory re
      43       65      6.4      487   22   AAI64066    Human bladder rela
      44     53.6      5.3      254   25   ABX31310    Human GDP-mannose
      45       43      4.2      447   21   AAC06449    Human secreted pro
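
Some rows in the summaries carry a leading "c". The report does not define this flag; the sketch below assumes it marks hits found on the complementary strand of the database entry, which would be consistent with the hit total being twice the number of sequences searched. Under that assumption, the strand that was aligned can be obtained by reverse-complementing the stored sequence. The function name is illustrative.

```python
# Translation table for the four unambiguous bases, both cases.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


# First twelve bases of the query used in this search.
print(reverse_complement("ATGAAAAAAATG"))
```

Characters outside A, C, G and T (for example ambiguity codes) pass through unchanged in this sketch and would need explicit handling in real use.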



ALIGNMENTS 



RESULT 1 
AAF86462 

ID   AAF86462 standard; cDNA; 1014 BP.
XX
AC   AAF86462;
XX
DT   26-JUN-2001  (first entry)
XX
DE   Human Acute Neuronal Induced Calcium Binding Protein, ANIC-BP, cDNA.
XX
KW   Human; cerebroprotective; neuroprotective; vulnerary; vaccine;
KW   gene therapy; Acute Neuronal Induced Calcium Binding Protein; ANIC-BP;
KW   stroke; acute head trauma; multiple sclerosis; spinal cord injury; ss.
XX
OS   Homo sapiens.
XX
FH   Key             Location/Qualifiers
FT   CDS             1..1014
FT                   /*tag=  a
FT                   /product= "Human Acute Neuronal Induced Calcium Binding
FT                   Protein, ANIC-BP"
XX
PN   WO200123552-A1.
XX
PD   05-APR-2001.
XX
PF   18-SEP-2000; 2000WO-EP09132.
XX
PR   24-SEP-1999;   99EP-0118848.
XX
PA   (MERE ) MERCK PATENT GMBH.
XX
PI   Den Daas I, Duecker K;
XX
DR   WPI; 2001-308142/32.
DR   P-PSDB; AAB82090.
XX
PT   Novel human acute neuronal induced calcium binding polypeptide, and
PT   polynucleotides encoding them useful for diagnosing or treating stroke,
PT   acute head trauma, multiple sclerosis and spinal cord injury -
XX
PS   Claim 5; Page 40-41; 45pp; English.
XX
CC   The present sequence is the coding sequence for human Acute Neuronal
CC   Induced Calcium Binding Protein (ANIC-BP). ANIC-BP coding sequence and
CC   protein are useful for treating stroke, acute head trauma, multiple
CC   sclerosis and spinal cord injury. ANIC-BP coding sequence and protein
CC   are also useful as vaccines for inducing an immunological response in a
CC   mammal.
XX
SQ   Sequence 1014 BP; 340 A; 205 C; 209 G; 260 T; 0 other;

  Query Match             100.0%;  Score 1014;  DB 22;  Length 1014;
  Best Local Similarity   100.0%;  Pred. No. 3.5e-272;
  Matches 1014;  Conservative    0;  Mismatches    0;  Indels    0;  Gaps    0;
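
The SQ line above gives the base composition of the entry, which offers a quick consistency check on a transcribed sequence. The sketch below counts bases in the 1014-base query as it reads in the alignments of this report (identical to this entry, since Matches is 1014 with no mismatches or gaps); the helper name is illustrative.

```python
from collections import Counter

# Query US-10-088-872-1, positions 1 to 1014, 60 bases per line.
QUERY = (
    "ATGAAAAAAATGCCTTTGTTTAGTAAATCACACAAAAATCCAGCAGAAATTGTGAAAATC"
    "CTGAAAGACAATTTGGCCATTTTGGAAAAGCAAGACAAAAAGACAGACAAGGCTTCAGAA"
    "GAAGTGTCTAAATCACTGCAAGCAATGAAAGAAATTCTGTGTGGTACAAACGAGAAAGAA"
    "CCCCCAACAGAAGCAGTGGCTCAGCTAGCACAAGAACTCTACAGCAGTGGCCTGCTAGTG"
    "ACACTGATAGCTGACCTGCAGCTGATAGACTTTGAGGGAAAAAAAGATGTGACCCAGATA"
    "TTTAACAACATCTTGAGAAGACAGATAGGCACTCGGAGTCCTACTGTGGAGTATATTAGT"
    "GCTCATCCTCATATCCTGTTTATGCTCCTCAAAGGATATGAAGCCCCACAGATTGCCTTA"
    "CGTTGTGGGATTATGCTGAGAGAATGTATTCGACATGAACCACTTGCCAAAATCATCCTC"
    "TTTTCTAATCAATTCAGAGATTTCTTTAAGTACGTGGAGTTGTCAACATTTGATATTGCT"
    "TCAGATGCCTTTGCTACTTTCAAGGATTTACTAACCAGACATAAAGTGTTGGTAGCAGAC"
    "TTCTTAGAACAAAATTACGACACTATTTTTGAAGACTATGAGAAATTGCTTCAGTCTGAG"
    "AATTATGTTACTAAGAGACAGTCTTTAAAGCTGCTAGGGGAGCTGATCCTGGACCGTCAC"
    "AACTTTGCCATCATGACAAAGTATATCAGCAAGCCGGAGAACCTGAAACTCATGATGAAC"
    "CTCCTTCGGGATAAAAGTCCCAACATCCAGTTTGAAGCCTTTCATGTTTTTAAGGTGTTT"
    "GTGGCCAGTCCTCACAAAACACAGCCTATTGTGGAGATCCTGTTAAAAAATCAGCCCAAA"
    "CTCATTGAGTTTCTGAGCAGCTTCCAAAAAGAAAGGACGGATGATGAGCAGTTCGCTGAC"
    "GAGAAGAACTACTTGATTAAACAGATCCGAGACTTGAAGAAAACGGCCCCTTGA"
)


def base_composition(seq):
    """Count A, C, G, T and everything else, as on an SQ line."""
    counts = Counter(seq.upper())
    comp = {base: counts.get(base, 0) for base in "ACGT"}
    comp["other"] = len(seq) - sum(comp.values())
    return comp


print(len(QUERY), base_composition(QUERY))
```

The counts can be compared directly with the SQ line (340 A; 205 C; 209 G; 260 T; 0 other).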

Qy       1 ATGAAAAAAATGCCTTTGTTTAGTAAATCACACAAAAATCCAGCAGAAATTGTGAAAATC 60
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1 ATGAAAAAAATGCCTTTGTTTAGTAAATCACACAAAAATCCAGCAGAAATTGTGAAAATC 60

Qy      61 CTGAAAGACAATTTGGCCATTTTGGAAAAGCAAGACAAAAAGACAGACAAGGCTTCAGAA 120
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      61 CTGAAAGACAATTTGGCCATTTTGGAAAAGCAAGACAAAAAGACAGACAAGGCTTCAGAA 120

Qy     121 GAAGTGTCTAAATCACTGCAAGCAATGAAAGAAATTCTGTGTGGTACAAACGAGAAAGAA 180
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     121 GAAGTGTCTAAATCACTGCAAGCAATGAAAGAAATTCTGTGTGGTACAAACGAGAAAGAA 180

Qy     181 CCCCCAACAGAAGCAGTGGCTCAGCTAGCACAAGAACTCTACAGCAGTGGCCTGCTAGTG 240
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     181 CCCCCAACAGAAGCAGTGGCTCAGCTAGCACAAGAACTCTACAGCAGTGGCCTGCTAGTG 240

Qy     241 ACACTGATAGCTGACCTGCAGCTGATAGACTTTGAGGGAAAAAAAGATGTGACCCAGATA 300
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     241 ACACTGATAGCTGACCTGCAGCTGATAGACTTTGAGGGAAAAAAAGATGTGACCCAGATA 300

Qy     301 TTTAACAACATCTTGAGAAGACAGATAGGCACTCGGAGTCCTACTGTGGAGTATATTAGT 360
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     301 TTTAACAACATCTTGAGAAGACAGATAGGCACTCGGAGTCCTACTGTGGAGTATATTAGT 360

Qy     361 GCTCATCCTCATATCCTGTTTATGCTCCTCAAAGGATATGAAGCCCCACAGATTGCCTTA 420
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     361 GCTCATCCTCATATCCTGTTTATGCTCCTCAAAGGATATGAAGCCCCACAGATTGCCTTA 420

Qy     421 CGTTGTGGGATTATGCTGAGAGAATGTATTCGACATGAACCACTTGCCAAAATCATCCTC 480
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     421 CGTTGTGGGATTATGCTGAGAGAATGTATTCGACATGAACCACTTGCCAAAATCATCCTC 480

Qy     481 TTTTCTAATCAATTCAGAGATTTCTTTAAGTACGTGGAGTTGTCAACATTTGATATTGCT 540
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     481 TTTTCTAATCAATTCAGAGATTTCTTTAAGTACGTGGAGTTGTCAACATTTGATATTGCT 540

Qy     541 TCAGATGCCTTTGCTACTTTCAAGGATTTACTAACCAGACATAAAGTGTTGGTAGCAGAC 600
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     541 TCAGATGCCTTTGCTACTTTCAAGGATTTACTAACCAGACATAAAGTGTTGGTAGCAGAC 600

Qy     601 TTCTTAGAACAAAATTACGACACTATTTTTGAAGACTATGAGAAATTGCTTCAGTCTGAG 660
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     601 TTCTTAGAACAAAATTACGACACTATTTTTGAAGACTATGAGAAATTGCTTCAGTCTGAG 660

Qy     661 AATTATGTTACTAAGAGACAGTCTTTAAAGCTGCTAGGGGAGCTGATCCTGGACCGTCAC 720
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     661 AATTATGTTACTAAGAGACAGTCTTTAAAGCTGCTAGGGGAGCTGATCCTGGACCGTCAC 720

Qy     721 AACTTTGCCATCATGACAAAGTATATCAGCAAGCCGGAGAACCTGAAACTCATGATGAAC 780
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     721 AACTTTGCCATCATGACAAAGTATATCAGCAAGCCGGAGAACCTGAAACTCATGATGAAC 780

Qy     781 CTCCTTCGGGATAAAAGTCCCAACATCCAGTTTGAAGCCTTTCATGTTTTTAAGGTGTTT 840
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     781 CTCCTTCGGGATAAAAGTCCCAACATCCAGTTTGAAGCCTTTCATGTTTTTAAGGTGTTT 840

Qy     841 GTGGCCAGTCCTCACAAAACACAGCCTATTGTGGAGATCCTGTTAAAAAATCAGCCCAAA 900
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     841 GTGGCCAGTCCTCACAAAACACAGCCTATTGTGGAGATCCTGTTAAAAAATCAGCCCAAA 900

Qy     901 CTCATTGAGTTTCTGAGCAGCTTCCAAAAAGAAAGGACGGATGATGAGCAGTTCGCTGAC 960
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     901 CTCATTGAGTTTCTGAGCAGCTTCCAAAAAGAAAGGACGGATGATGAGCAGTTCGCTGAC 960

Qy     961 GAGAAGAACTACTTGATTAAACAGATCCGAGACTTGAAGAAAACGGCCCCTTGA 1014
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     961 GAGAAGAACTACTTGATTAAACAGATCCGAGACTTGAAGAAAACGGCCCCTTGA 1014



RESULT 2 
AAI58234 

ID   AAI58234 standard; cDNA; 1421 BP.
XX
AC   AAI58234;
XX
DT   22-OCT-2001  (first entry)
XX
DE   Human polynucleotide SEQ ID NO 437
XX
KW   Human; nootropic; immunosuppressant; cytostatic; gene therapy; cancer;
KW   peripheral nervous system; neuropathy; central nervous system; CNS;
KW   Alzheimer's; Parkinson's disease; Huntington's disease; haemostatic;
KW   amyotrophic lateral sclerosis; Shy-Drager Syndrome; chemotactic;
KW   chemokinetic; thrombolytic; drug screening; arthritis; inflammation;
KW   leukaemia; ss.
XX
OS   Homo sapiens.
XX
PN   WO200153312-A1.
XX
PD   26-JUL-2001.
XX
PF   26-DEC-2000; 2000WO-US34263.
XX
PR   21-JAN-2000; 2000US-0488725.
PR   25-APR-2000; 2000US-0552317.
PR   09-JUL-2000; 2000US-0598042.
PR   19-JUL-2000; 2000US-0620312.
PR   03-AUG-2000; 2000US-0653450.
PR   14-SEP-2000; 2000US-0662191.
PR   19-OCT-2000; 2000US-0693036.
PR   29-NOV-2000; 2000US-0727344.
XX
PA   (HYSE-) HYSEQ INC.
XX
PI   Tang YT, Liu C, Asundi V, Chen R, Ma Y, Qian XB, Ren F, Wang D;
PI   Wang J, Wang Z, Wehrman T, Xu C, Xue AJ, Yang Y, Zhang J;
PI   Zhao QA, Zhou P, Goodrich R, Drmanac RT;
XX
DR   WPI; 2001-442253/47.
DR   P-PSDB; AAM39078.
XX
PT   Novel nucleic acids and polypeptides, useful for treating disorders
PT   such as central nervous system injuries -
XX
PS   Claim 1; SEQ ID NO 437; 10078pp; English.
XX
CC   The invention relates to human nucleic acids (AAI57798-AAI61369) and
CC   the encoded polypeptides (AAM38642-AAM42213) with nootropic,
CC   immunosuppressant and cytostatic activity. The polynucleotides are useful
CC   in gene therapy. A composition containing a polypeptide or polynucleotide
CC   of the invention may be used to treat diseases of the peripheral nervous
CC   system, such as peripheral nervous injuries, peripheral neuropathy and
CC   localised neuropathies and central nervous system diseases, such as
CC   Alzheimer's, Parkinson's disease, Huntington's disease, amyotrophic
CC   lateral sclerosis, and Shy-Drager Syndrome. Other uses include the
CC   utilisation of the activities such as: Immune system suppression,
CC   Activin/inhibin activity, chemotactic/chemokinetic activity, haemostatic
CC   and thrombolytic activity, cancer diagnosis and therapy, drug screening,
CC   assays for receptor activity, arthritis and inflammation, leukaemias and
CC   C.N.S disorders.
CC   Note: The sequence data for this patent did not form part of the printed
CC   specification.
XX
SQ   Sequence 1421 BP; 469 A; 284 C; 306 G; 362 T; 0 other;

  Query Match             100.0%;  Score 1014;  DB 22;  Length 1421;
  Best Local Similarity   100.0%;  Pred. No. 4e-272;
  Matches 1014;  Conservative    0;  Mismatches    0;  Indels    0;  Gaps    0;

Qy       1 ATGAAAAAAATGCCTTTGTTTAGTAAATCACACAAAAATCCAGCAGAAATTGTGAAAATC 60
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     217 ATGAAAAAAATGCCTTTGTTTAGTAAATCACACAAAAATCCAGCAGAAATTGTGAAAATC 276

Qy      61 CTGAAAGACAATTTGGCCATTTTGGAAAAGCAAGACAAAAAGACAGACAAGGCTTCAGAA 120
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     277 CTGAAAGACAATTTGGCCATTTTGGAAAAGCAAGACAAAAAGACAGACAAGGCTTCAGAA 336

Qy     121 GAAGTGTCTAAATCACTGCAAGCAATGAAAGAAATTCTGTGTGGTACAAACGAGAAAGAA 180
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     337 GAAGTGTCTAAATCACTGCAAGCAATGAAAGAAATTCTGTGTGGTACAAACGAGAAAGAA 396

Qy     181 CCCCCAACAGAAGCAGTGGCTCAGCTAGCACAAGAACTCTACAGCAGTGGCCTGCTAGTG 240
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     397 CCCCCAACAGAAGCAGTGGCTCAGCTAGCACAAGAACTCTACAGCAGTGGCCTGCTAGTG 456

Qy     241 ACACTGATAGCTGACCTGCAGCTGATAGACTTTGAGGGAAAAAAAGATGTGACCCAGATA 300
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     457 ACACTGATAGCTGACCTGCAGCTGATAGACTTTGAGGGAAAAAAAGATGTGACCCAGATA 516

Qy     301 TTTAACAACATCTTGAGAAGACAGATAGGCACTCGGAGTCCTACTGTGGAGTATATTAGT 360
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     517 TTTAACAACATCTTGAGAAGACAGATAGGCACTCGGAGTCCTACTGTGGAGTATATTAGT 576

Qy     361 GCTCATCCTCATATCCTGTTTATGCTCCTCAAAGGATATGAAGCCCCACAGATTGCCTTA 420
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     577 GCTCATCCTCATATCCTGTTTATGCTCCTCAAAGGATATGAAGCCCCACAGATTGCCTTA 636

Qy     421 CGTTGTGGGATTATGCTGAGAGAATGTATTCGACATGAACCACTTGCCAAAATCATCCTC 480
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     637 CGTTGTGGGATTATGCTGAGAGAATGTATTCGACATGAACCACTTGCCAAAATCATCCTC 696

Qy     481 TTTTCTAATCAATTCAGAGATTTCTTTAAGTACGTGGAGTTGTCAACATTTGATATTGCT 540
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     697 TTTTCTAATCAATTCAGAGATTTCTTTAAGTACGTGGAGTTGTCAACATTTGATATTGCT 756

Qy     541 TCAGATGCCTTTGCTACTTTCAAGGATTTACTAACCAGACATAAAGTGTTGGTAGCAGAC 600
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     757 TCAGATGCCTTTGCTACTTTCAAGGATTTACTAACCAGACATAAAGTGTTGGTAGCAGAC 816

Qy     601 TTCTTAGAACAAAATTACGACACTATTTTTGAAGACTATGAGAAATTGCTTCAGTCTGAG 660
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     817 TTCTTAGAACAAAATTACGACACTATTTTTGAAGACTATGAGAAATTGCTTCAGTCTGAG 876

Qy     661 AATTATGTTACTAAGAGACAGTCTTTAAAGCTGCTAGGGGAGCTGATCCTGGACCGTCAC 720
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     877 AATTATGTTACTAAGAGACAGTCTTTAAAGCTGCTAGGGGAGCTGATCCTGGACCGTCAC 936

Qy     721 AACTTTGCCATCATGACAAAGTATATCAGCAAGCCGGAGAACCTGAAACTCATGATGAAC 780
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     937 AACTTTGCCATCATGACAAAGTATATCAGCAAGCCGGAGAACCTGAAACTCATGATGAAC 996

Qy     781 CTCCTTCGGGATAAAAGTCCCAACATCCAGTTTGAAGCCTTTCATGTTTTTAAGGTGTTT 840
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     997 CTCCTTCGGGATAAAAGTCCCAACATCCAGTTTGAAGCCTTTCATGTTTTTAAGGTGTTT 1056

Qy     841 GTGGCCAGTCCTCACAAAACACAGCCTATTGTGGAGATCCTGTTAAAAAATCAGCCCAAA 900
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    1057 GTGGCCAGTCCTCACAAAACACAGCCTATTGTGGAGATCCTGTTAAAAAATCAGCCCAAA 1116

Qy     901 CTCATTGAGTTTCTGAGCAGCTTCCAAAAAGAAAGGACGGATGATGAGCAGTTCGCTGAC 960
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    1117 CTCATTGAGTTTCTGAGCAGCTTCCAAAAAGAAAGGACGGATGATGAGCAGTTCGCTGAC 1176

Qy     961 GAGAAGAACTACTTGATTAAACAGATCCGAGACTTGAAGAAAACGGCCCCTTGA 1014
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    1177 GAGAAGAACTACTTGATTAAACAGATCCGAGACTTGAAGAAAACGGCCCCTTGA 1230





RESULT 3 
AAA27332 

ID   AAA27332 standard; cDNA; 1344 BP.
XX
AC   AAA27332;
XX
DT   10-AUG-2000  (first entry)
XX
DE   Human calcium binding protein hCBP gene.
XX
KW   Human; calcium binding protein; cancer; inflammation; CBP;
KW   reproductive disorder; autoimmune disorder; developmental disorder;
KW   seizure disorder; immune disorder; infection; ss.
XX
OS   Homo sapiens.
XX
FH   Key             Location/Qualifiers
FT   CDS             124..1134
FT                   /*tag=  a
FT                   /product= "calcium binding protein"
XX
PN   WO200029580-A1.
XX
PD   25-MAY-2000.
XX
PF   12-NOV-1999;   99WO-US27027.
XX
PR   13-NOV-1998;   98US-0190965.
XX
PA   (INCY-) INCYTE PHARM INC.
XX
PI   Tang YT, Guegler KJ, Corley NC, Gorgone GA;
XX
DR   WPI; 2000-387793/33.
DR   P-PSDB; AAY94247.
XX
PT   Human hCBP protein, and the nucleic acid encoding it, useful for e.g.
PT   diagnosis, prevention and treatment of cancers, immune, developmental
PT   or reproductive disorders -
XX
PS   Claim 9; Fig 1; 72pp; English.
XX
CC   The present sequence is the human calcium binding protein hCBP gene. It
CC   was obtained by screening a coronary artery smooth muscle cDNA library,
CC   from which five overlapping nucleic acids were isolated and
CC   sequenced, and then expressed to give the protein. The protein and the
CC   gene encoding it are useful for the diagnosis and treatment of the
CC   following types of disorder: cancers (such as adenocarcinomas),
CC   reproductive disorders (such as infertility, ovulatory defects,
CC   endometriosis, disruptions of the oestrus and menstrual cycles,
CC   polycystic ovary syndrome and ovarian hyperstimulation), autoimmune
CC   disorders (such as benign prostatic hyperplasia and prostatitis),
CC   developmental disorders (such as Cushing's syndrome, muscular dystrophy
CC   and gonadal dysgenesis), hereditary neuropathies, seizure disorders,
CC   immune disorders (such as AIDS, allergies, anaemia, asthma,
CC   atherosclerosis, cholecystitis, Crohn's disease, diabetes, Graves'
CC   disease, multiple sclerosis, psoriasis, rheumatoid arthritis,
CC   scleroderma, Sjogren's syndrome and ulcerative colitis), and viral,
CC   bacterial, fungal, parasitic, protozoal and helminthic infections.
XX
SQ   Sequence 1344 BP; 450 A; 261 C; 280 G; 353 T; 0 other;

  Query Match             99.7%;  Score 1010.8;  DB 21;  Length 1344;
  Best Local Similarity   99.8%;  Pred. No. 3.1e-271;
  Matches 1012;  Conservative    0;  Mismatches    2;  Indels    0;  Gaps    0;
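
The three statistics lines printed for each result are related by simple arithmetic. The sketch below is an interpretation, not the search program's code: it assumes Query Match is the score divided by the perfect score (1014 for this query), Best Local Similarity is matches divided by aligned columns, and the score itself is +1 per match, -0.6 per mismatch (a value inferred from the reported scores, not stated in the report) minus Gapop + n x Gapext for each gap of length n. All function names are illustrative.

```python
def identity_score(matches, mismatches, gap_lengths=(),
                   match=1.0, mismatch=-0.6, gap_open=10.0, gap_ext=1.0):
    """Alignment score under the assumed identity scoring scheme."""
    gap_cost = sum(gap_open + gap_ext * n for n in gap_lengths)
    return matches * match + mismatches * mismatch - gap_cost


def query_match_pct(score, perfect_score):
    """Score as a percentage of the query's self-match score."""
    return 100.0 * score / perfect_score


def best_local_similarity_pct(matches, mismatches, indels):
    """Matches as a percentage of all aligned columns."""
    return 100.0 * matches / (matches + mismatches + indels)


# Values printed above for this result: 1012 matches, 2 mismatches, no gaps.
score = identity_score(1012, 2)
print(round(score, 1),
      round(query_match_pct(score, 1014), 1),
      round(best_local_similarity_pct(1012, 2, 0), 1))
```

With these assumptions the three printed values come out as 1010.8, 99.7 and 99.8, in line with the statistics lines for this result.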



Qy 


1 


ATGAAAAAAATGCCTTTGTTTAGTAAAT(^CACAAAAATC(^GCAGAAATTGTGAAAAT^ 

M II II 1 II 1 II 1 M 1 II M MIMI Ml II III 1 M 1 IMM II Ml II III Ml III 

ATGAAAAAAATGCCTTTGTTTAGTAAATCACACAAA 


60 


Db 


124 


183 


Qy 


61 


CTGAAAGACAATTTGG CCATTTTGGAAAAGCAAGACAAAAAGAC^^ CTTCAGAA 

1 II 1 MMM 1 II MM MJ MM 1 1 1 1 1 1 1 1 M! MM 1 1 Ml 1 1 1 II 1 1 Ml II II 1 II 1 

CTGAAAGACAATTTGG C CATTTTGGAAAAG CAAGACAAAAAGACAGACAAGG CTTCAGAA 


120 


Db 


184 


243 



Qy 121 GAAGTGTCTAAATCACTGCAAGC^TGAAAGAAATTCTGTGTGGTACAAACGAGAAAGAA 18 0 

Db 244 GAAGTGTCTAAATCACTGCAAGCAATGAAAGAAATTCTGTGTGGTACAAACGAGAAAGAA 303 



181 CCCCCAACAGAAGCAGTGGCTCAGCTAGCACAAGAACTCTACAGCAGTGGCCTGCTAGTG 24 0 

304 CCCCCGACAGaU 363 
241 ACACTGATAGCTGACCTGCAGCTGATAGACTTTGAGGGAAAAAAAGATGTGACCCAGATA 3 00 

364 ACACTGATAGCTGACC^ 423 
301 TTTAACAACATCTTGAGAAGACAGATAGGCACTCGGAGTCCTACTGTGGAGTATATTAGT 360 

424 llTAACAACATcilGAGAAGAC^ 483 
361 GCTCATCCTCATATCCTGTTTATGCTCCTCAAAGGATATGAAGCCCCACAGATTGCCTTA 42 0 

4 84 GCTCATCC^CATA 543 
421 CGTTGTGGGATTATGCTGAGAGAATGTATTCGACATGAACCACTTGCCAAAATCATCCTC 48 0 

544 cg!^3TCG(^T^ATCCTCAGA 603 
481 TTTTCTAATCAATTCAGAGATTTCTTTAAGTACGTGGAGTTGTCAACATTTGATATTGCT 54 0 

604 tUtctaatcaattcagagatt^ 663 

541 TCAGATGCCTTTGCTACTTTCAAGGATTTACTAACCAGACATAAAGTGTTGGTAGCAGAC 60 0 

664 TCAGATGCCTTT^ 723 
601 TTCTTAGAACAAAATTACGACACTATTTTTGAAGACTATGAGAAATTGCTTCAGTCTGAG 660 

724 TTCtUgAACAAAAtUcGAC^ 783 
661 AATTATGTTACTAAGAGACAGTCTTTAAAGCTGCTAGGGGAGCTGATCCTGGACCGTCAC 720 

784 AATTATGT^ACTAA^ 843 
721 AACTTTGCCATCATGACAAAGTATATCAGCAAGCCGGAGAACCTGAAACTCATGATGAAC 780 

844 AACtItGCCATCATGACAAAGT^^ 903 
781 CTCCTTCGGGATAAAAGTCCCAACATCCAGTTTGAAGCCTTTCATGTTTTTAAGGTGTTT 840 

904 CTCCTTCGGGAT^ 963 
841 GTGGCCAGTCCTCACAAAACACAGCCTATTGTGGAGATCCTGTTAAAAAATCAGCCCAAA 900 
964 GTGGCCAGTCCTCACAAAACACAGCCTATTGTGGAGATCCTGTTAAAAAATCAGCCCAAA 1023 
901 CTCATTGAGTTTCTGAGCAGCTTCCAAAAAGAAAGGACGGATGATGAGCAGTTCGCTGAC 960 
1024 CTCATTGAGTTTCTGAGCAGCTTC 1083 



Qy 961 GAGAAGAACTACTTGATTAAACAGATCCGAGACTTGAAGAAAACGGCCCCTTGA 1014

Db 1084 GAGAAGAACTACTTGATTAAACAGATCCGAGACTTGAAGAAAACGGCCCCTTGA 1137
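
Each entry in the ALIGNMENTS section opens with a flat-file record whose lines begin with a two-letter code (ID, AC, DT, DE, KW and so on), with XX as a spacer. A minimal sketch of collecting such a record into a dictionary follows; it assumes one code per line and joins continuation lines with a space. The function name and the handling rules are illustrative and do not reproduce any official parser.

```python
def parse_record(text):
    """Group flat-file lines by their two-letter line code."""
    fields = {}
    for line in text.splitlines():
        code, _, value = line.partition(" ")
        value = value.strip()
        # Keep only upper-case two-letter codes; this skips XX spacers,
        # statistics lines and the Qy/Db alignment lines.
        if len(code) != 2 or not code.isupper() or code == "XX" or not value:
            continue
        fields.setdefault(code, []).append(value)
    return {code: " ".join(parts) for code, parts in fields.items()}


sample = """ID   AAA27332 standard; cDNA; 1344 BP.
XX
AC   AAA27332;
XX
KW   Human; calcium binding protein; cancer; inflammation; CBP;
KW   reproductive disorder; autoimmune disorder; developmental disorder;
XX
Qy 961 GAGAAGAACTACTTGATTAAACAG 984
"""
record = parse_record(sample)
print(record["AC"], "|", record["ID"])
```

Feature-table (FT) lines would need their own handling, since qualifiers there span several lines with internal structure.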

RESULT 4 
AAH15879 

ID   AAH15879 standard; cDNA; 2002 BP.
XX
AC   AAH15879;
XX
DT   26-JUN-2001  (first entry)
XX
DE   Human cDNA sequence SEQ ID NO: 14407.
XX
KW   Human; primer; detection; diagnosis; antisense therapy; gene therapy; ss.
XX
OS   Homo sapiens.
XX
PN   EP1074617-A2.
XX
PD   07-FEB-2001.
XX
PF   28-JUL-2000; 2000EP-0116126.
XX
PR   29-JUL-1999;   99JP-0248036.
PR   27-AUG-1999;   99JP-0300253.
PR   11-JAN-2000; 2000JP-0118776.
PR   02-MAY-2000; 2000JP-0183767.
PR   09-JUN-2000; 2000JP-0241899.
XX
PA   (HELI-) HELIX RES INST.
XX
PI   Ota T, Isogai T, Nishikawa T, Hayashi K, Saito K, Yamamoto J;
PI   Ishii S, Sugiyama T, Wakamatsu A, Nagai K, Otsuki T;
XX
DR   WPI; 2001-318749/34.
XX
PT   Primer sets for synthesizing polynucleotides, particularly the 5602
PT   full-length cDNAs defined in the specification, and for the detection
PT   and/or diagnosis of the abnormality of the proteins encoded by the
PT   full-length cDNAs -
XX
PS   Claim 8; SEQ ID 14407; 2537pp + CD ROM; English.
XX
CC   The present invention describes primer sets for synthesising 5602
CC   full-length cDNAs defined in the specification. Where a primer set
CC   comprises: (a) an oligo-dT primer and an oligonucleotide complementary
CC   to the complementary strand of a polynucleotide which comprises one of
CC   the 5602 nucleotide sequences defined in the specification, where the
CC   oligonucleotide comprises at least 15 nucleotides; or (b) a combination
CC   of an oligonucleotide comprising a sequence complementary to the
CC   complementary strand of a polynucleotide which comprises a 5'-end
CC   sequence and an oligonucleotide comprising a sequence complementary to a
CC   polynucleotide which comprises a 3'-end sequence, where the
CC   oligonucleotide comprises at least 15 nucleotides and the combination of
CC   the 5'-end sequence/3'-end sequence is selected from those defined in
CC   the specification. The primer sets can be used in antisense therapy and
CC   in gene therapy. The primers are useful for synthesising polynucleotides,
CC   particularly full-length cDNAs. The primers are also useful for the
CC   detection and/or diagnosis of the abnormality of the proteins encoded by
CC   the full-length cDNAs. The primers allow obtaining of the full-length
CC   cDNAs easily without any specialised methods. AAH03166 to AAH13628 and
CC   AAH13633 to AAH18742 represent human cDNA sequences; AAB92446 to
CC   AAB95893 represent human amino acid sequences; and AAH13629 to AAH13632
CC   represent oligonucleotides, all of which are used in the exemplification
CC   of the present invention.
XX
SQ   Sequence 2002 BP; 594 A; 418 C; 463 G; 527 T; 0 other;

  Query Match             97.9%;  Score 992.8;  DB 22;  Length 2002;
  Best Local Similarity   99.8%;  Pred. No. 3.8e-266;
  Matches  994;  Conservative    0;  Mismatches    2;  Indels    0;  Gaps    0;

TTTAGTAAATGA.CACAAAAATCCAGCAGAAATTG 7 8 

i II ! 1 1 M I M i 1 1 1 1 1 II I II i I II I II I Ml I !l I II 1 1 1 1 1 II! 1 1 III Mill 1 1 

TTTAGTAAATCACACAAAAAT C CAG CAGAAATTGTGAAAATC CTGAAAGACAATTTGG CC 6 0 
ATTTTGGAAAAGCAAGACAAAAAGACAGACAAGGCT 138 

1 1 1 1 1 1 1 II 1 1 M I IM 1 1 III Ml MM III 1 1 1 1 1 1 1 1 1 II II 1 1 1 1 1 1 1 II II II 

ATTTTGGAAAAG CAAGACAAAAAGAC7VGACAAGG CTTCAGAAGAAGTGTCTAAAT CACTG 12 0 
CAAGCAATGAAAGAAATTCTGTGTGGTACAAACGAGAA 198 

I II I II IM I II 1 1 II II 1 1 II Ml II! II III II I II 1 1 1 M 1 1 1 1 1 1 II I II MM 

CAAG CAATGAAAGAAATTCTGTGTGGTACAAACGAGAAAGAAC C CCCAACAGAAG CAGTG 18 0 
GCTCAGCTAGCACAAGAACTCTACAGCAGTGGCCTGCTAGTGACACTGATAGCTGACCTG 2 58 

i ; 1 1 1 1 i 1 1 1 1 1 1 1 1 1 1 1 1 1 ii 1 1 1 1 1 1 M 1 1 1 1 M 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 m 

GCTCAGCTAGCACAAGAACTCTACAGCAGTGGCCTGCTGGTGACACTGATAGCTGACCTG 24 0 
CAG CTGATAGACTTTGAGGGAAAAAAAGATGTGAC C CAGATATTTAACAACATCTTGAGA 318 

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M Is I' 1 1 1 1 1 1 1 1 1 1 1 M 1 1 1 1 1 1 Ml 1 1 1 1 1 II M 1 1 1 1 1 

CAGCTGATAGACTTTGAGGGAAAAAAAGATGTGACCCAGATATTTAACAACATCTTGAGA 3 0 0 
AGACAGATAGGCACTCGGAGTCCTACTGTGGAGTATATTAGTGCTCATCCTCATATCCTG 378 

1 1 1 1 1 1 1 ' 1 1 1 M 1 1 M 1 1 III 1 1 1 1 ! 1 1 1 1 M 1 1 1 1 1 1 1 1 1 1 1 1 1! 1 1! 1 1 1 1 1 1 1 

AGACAGATAGGCACTCGGAGTCCTACTGTGGAGTATATTAGTGCTCATCCTCATATCCTG 360 
TTTATGCTCCTCAAAGGATATGAAGCCCCACAGATTGCCTTACGTTGTGGGATTATGCTG 4 3 8 

1 1 1 1 1 1 1 1 1 1 ! 1 1 1 1 1 !. 1 1 1 M 1 1 1 1 1 1 1 1 ! 1 1 1 1 1 II I II I M 1 1 1 1 1 M i II 1 1 1 1 1 

TTTATGCTCCTCAAAGGATATGAAGCCCCACAGATTGCCTTACGTTGTGGGATTATGCTG 42 0 

AGAGAATGTATTCGACATGAACCACTTGCCAAAATCATCCTCTTTTCTAATC^ 4 98 

I ! I I I! I I ; I I I I I II I I I I II I I I I I I I I I I I I I I I I I I I I I I II I II I I I I I I I 

AGAGAATGTATTCGACATGAAC(^CTTGTCAAAATCATCCTCTTTTCTAAT(^TTCAGA 48 0 

GATTTCTTTAAGTACGTGGAGTTGTCAACATTTGATATTGCTTCAGATGCCTTTGCTACT 558 

1 1 1 1 1 1 1 1 1 1 1 1 1 : 1 1 1 M I II Ml 1 1 ; I M 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 II 1 1 1 1 

GATTTCTTTAAGTACGTGGAGTTGTCAACATTTGATATTGCTTCAGATGCCTTTGCTACT 54 0 
TTCAAGGATTTACTAAC CAGACATAAAGTGTTGGTAG CAGACTTCTTAGAACAAAATTAC 618 

II I II II! II MIMM II III IIMM II II I II I III II Mill II I II Ml M III 

TTCAAGGATTTACTAACCAGACATAAAGTGTTGGTAGCAGACTTCTTAGAACAAAATTAC 600 



Qy 


19 


Db 


1 


Qy 


79 


Db 


61 


Qy 


139 


Db 


121 


Qy 


199 


Db 


181 


Qy 


259 


Db 


241 


Qy 


319 


Db 


301 


Qy 


379 


Db 


361 


Qy 


439 


Db 


421 


Qy 


499 


Db 


481 


Qy 


559 


Db 


541 



Qy 



619 GACACTATTTTTGAAGACTATGAGAAATTGCTTCAGTCTGAGAATTATGTTACTAAGAGA 678 



Db 


601 


Qy 


679 


Db 


661 


Qy 


739 


Db 


721 


Qy 


799 


Db 


781 


Qy 


859 


Db 


841 


Qy 


919 


Db 


901 


Qy 


979 


Db 


961 



1 1 1 1 1 1 1 M i II 1 1 1 1 M 1 1 ! II 1 1 M I M 1 1 1 1 1 M 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 

GACACTATTTTTGAAGACTATGAGAAATTG CTT CAGTCTGAGAATTATGTTACTAAGAGA 660 
CAGTCTTTAAAGCTGCTAGGGGAGCTGATCCTGGACCGTCACAACTTTGCCATCATGACA 73 8 

llllllllllllllllllllllllllllllllllllllllllllllllllllllllllll 

CAGTCTTTAAAGCTGCTAGGGGAGCTGATCCTGGACCGTCACAACTTTGCCATCATGACA 720 



ilMMII IIIIIMIiillll MIIMIIIIMIIIIIIIIMM IIIMIMIII! 



CCCAACATCCAGTTTGAAGCCTTTCATGTTTTTAAGGTGTTTGTGGCCAGTCCTCACAAA 858 

1 1 1 1 II I i I MM 1 1 II I MM! 1 1 'I I : MM 1 1 1 ill 1 1 1 1 1 1 1 1 III II 1 1 II 1 1 II 

CCCAACATCCAGTTTGAAGCCTTTCATGTTTTTAAGGTGTTTGTGGCCAGTCCTCACAAA 84 0 
ACACAGCCTATTGTGGAGATCCTGTTAAAAAATCAGCCCAAACTCATTGAGTTTCTGAGC 918 

I III I II I III II M M M II I III M II I II 1 1 1 II I II I Ml II M I II I II III I 

A(^CAGCCTATTGTGGAGATCCTGTTAAAAAATCAGCCCAAACTCATTGAGTTTCTGAGC 900 
AGCTTCCAAAAAGAAAGGACGGATGATGAGCAGTTCGCTGACGAGAAGAACTACTTGATT 978 

1 1 1 1 1 1 M I ! 1 1 1 1 1 1 1 1 1 1 1 M 1 1 1 1 1 1 1 1 1 1 1 1 !! 1 1 1 1 1 1 1 1 1 1 1 1 1! 1 1 1 1 1 1 1 1 

AG CTT C CAAAAAGAAAGGACGGATGATGAG CAGTTCG CTGACGAGAAGAACTACTTGATT 960 

AAACAGAT CCGAGACTTGAAGAAAACGG CCCCTTGA 1014 

I I M I I I I I I I i I I I I I I I I I I M I I I I I I I I 

AAACAGATCCGAGACTTGAAGAAAACGGCCCCTTGA 996 



RESULT 5 
AAH05471 

ID AAH05471 standard; cDNA; 822 BP. 
XX 

AC AAH05471; 
XX 

DT 26-JUN-2001 (first entry) 
XX 

DE Human cDNA clone (5'-primer) SEQ ID NO: 2306. 
XX 

KW Human; primer; detection; diagnosis; antisense therapy; gene therapy; ss. 
XX 

OS Homo sapiens. 
XX 

PN EP1074617-A2. 
XX 

PD 07-FEB-2001. 
XX 

PF 28-JUL-2000; 2000EP-0116126. 
XX 

PR 29-JUL-1999; 99JP-0248036. 
PR 27-AUG-1999; 99JP-0300253. 
PR 11-JAN-2000; 2000JP-0118776. 
PR 02-MAY-2000; 2000JP-0183767. 
PR 09-JUN-2000; 2000JP-0241899. 
XX 

PA (HELI-) HELIX RES INST. 
XX 

PI Ota T, Isogai T, Nishikawa T, Hayashi K, Saito K, Yamamoto J; 



PI Ishii S, Sugiyama T, Wakamatsu A, Nagai K, Otsuki T; 
XX 

DR WPI; 2001-318749/34. 
XX 

PT Primer sets for synthesizing polynucleotides, particularly the 5602 

PT full-length cDNAs defined in the specification, and for the detection 

PT and/or diagnosis of the abnormality of the proteins encoded by the 

PT full-length cDNAs - 
XX 

PS Claim 1; SEQ ID 2306; 2537pp + CD ROM; English. 
XX 

CC The present invention describes primer sets for synthesising 5602 

CC full-length cDNAs defined in the specification. Where a primer set 

CC comprises: (a) an oligo-dT primer and an oligonucleotide complementary 

CC to the complementary strand of a polynucleotide which comprises one of 

CC the 5602 nucleotide sequences defined in the specification, where the 

CC oligonucleotide comprises at least 15 nucleotides; or (b) a combination 

CC of an oligonucleotide comprising a sequence complementary to the 

CC complementary strand of a polynucleotide which comprises a 5'-end 

CC sequence and an oligonucleotide comprising a sequence complementary to a 

CC polynucleotide which comprises a 3'-end sequence, where the 

CC oligonucleotide comprises at least 15 nucleotides and the combination of 

CC the 5'-end sequence/3'-end sequence is selected from those defined in 

CC the specification. The primer sets can be used in antisense therapy and 

CC in gene therapy. The primers are useful for synthesising polynucleotides, 

CC particularly full-length cDNAs. The primers are also useful for the 

CC detection and/or diagnosis of the abnormality of the proteins encoded by 

CC the full-length cDNAs. The primers allow obtaining of the full-length 

CC cDNAs easily without any specialised methods. AAH03166 to AAH13628 and 

CC AAH13633 to AAH18742 represent human cDNA sequences; AAB92446 to 

CC AAB95893 represent human amino acid sequences; and AAH13629 to AAH13632 

CC represent oligonucleotides, all of which are used in the exemplification 

CC of the present invention. 

XX 

SQ Sequence 822 BP; 268 A; 164 C; 171 G; 216 T; 3 other; 

Query Match 76.0%; Score 770.6; DB 22; Length 822; 

Best Local Similarity 98.5%; Pred. No. 2.1e-204; 

Matches 798; Conservative 0; Mismatches 10; Indels 2; Gaps 2; 
QY 19 TTTAGTAAATCACACAAAAATCCAGCAGAAATT 78 



Db 




Qy 



79 ATTTTGGAAAAG CAAGACAAAAAGACAGA CAAGGCTT CAGAAGAAGTGTCTAAAT CACTG 138 



Db 




Qy 



13 9 CAAGCAATGAAAGAAATTCTGTGTGGTACAAACGAGAAAGAACCCC 198 



Db 




Qy 



199 GCTCAGCTAGCACAAGAACTCTACAGCAGTGGCCTGCTAGTGACACTGATAGCTGACCTG 258 



Db 




Qy 



25 9 CAGCTGATAGACTTTGAGGGAAAAAAAGATGTGACCCAGATATTTAACAACATCTTGAGA 318 



Db 


241 


Qy 


319 


Db 


301 


Qy 


379 


Db 


361 


Qy 


439 


Db 


421 


Qy 


499 


Db 


481 


Qy 


559 


Db 


541 


Qy 


619 


Db 


601 


Qy 


679 


Db 


661 


Qy 


739 


Db 


721 


Qy 


799 


Db 


779 



1 1 1 M 1 1 1 1 1 1 1 1 1 ! I II 1 1 1 III 1 1 1 II I M 1 1 1 1 1 M 1 1 1 1 1 1 1 1 1 II 1 1 1 II 1 1 1 1 1 

CAGCTGATAGACTTTGAGGGAAAAAAAGATGTGACCCAGATATTTAACAACATCTTGAGA 300 
AGACAGATAGGCACTCGGAGTCCTACTGTGGAGTATATTAGTGCTCATCCTCATATCCTG 378 

I II 1 1 1 1 1 1 M Ml 1 1! 1 1 1 1 II Ml 1 1 1 1 1 1 1 IN 1 1 1 Mill II! 1 1 1 il Ml (II I 

AGACAGATAGGCACTCGGAGTCCTACTGTGGAGTATATTAGTGCTCATCCTCATATCCTG 360 
TTTATGCTCCTCAAAGGATATGAAGCCCCACAGATTGCCTTACGTTGTGGGATTATGCTG 438 

Ml II 1 1 1 1 1 1 1 1 II I IM 1 1 II Ml MM II 1 1 1 II 1 1 1 II I M I II II III MUM I 

TTTATGCTCCTCAAAGGATATGAAGCCCCACAGATTGCCTTACGTTGTGGGATTATGCTG 420 
AGAGAATGTATTCGACATGAACCACTTGCCAAAATCATCCTCTTTTCTAATCAATTCAGA 4 98 

I II 1 1 1 MM I Ml M M 1 1 1 II II 1 1 1 1 MINIM III III Mill II I II III II 

AGAGAATGTATTCGACATGAACCACTTGTCAAAATCATCCTCTTTTCTAATCAATTCAGA 48 0 
GATTTCTTTAAGTACGTGGAGTTGTCAACATTTGATATTGCTTCAGATGCCTTTGCTACT 558 

I II M 1 1 1 1 1 1 1 1 Mill 1 1 1 II 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 IMIMI I II III MINI I 

GATTTCTTTAAGTACGTGGAGTTGTCAACATTTGATATTGCTTCAGATGCCTTTGCTACT 54 0 
TTCAAGGATTTACTAACCAGACATAAAGTGTTGGTAGCAGACTTCTTAGAACAAAATTAC 618 

111 III MM Ml I MM III MM I MINI MM IMIMI MM Ml MM MM || 

TTCAAGGATTTACTAACCAGACATAAAGTGTTGGTAGCAGACTTCTTAGAACAAAATTAC 600 
GACACTATTTTTGAAGACTATGAGAAATTGCTTCAGTCTGAGAATTATGTTACTAAGAGA 678 

M 1 1 1 1 Ml 1 1 1 1 II Ml II I MMM II 1 1 1 1 1 III 1 1 1 II III III 1 1 M I II I M II 

GACACTATTTTTGAAGACTATGAGAAATTGCTTCAGTCTGAGAATTATGTTACTAAGAGA 660 
CAGTCTTTAAAGCTGCTAGGGGAGCTGATCCTGGACCGTCACAACTTTGCCATCATGACA 73 8 

M 1 1 II I II I II I II II I II 1 1 II 1 1 II 1 1 M II 1 1 1 1 1 1 1 II II 1 1 1 M 1 1 M II II 1 1 

CAGTCTTTAAAGCTGCTAGGGGAGCTGATCCTGGACCGTCACAACTTTGCCATCATGACA 720 



M 1 1 M 1 1 1 M M M M I M M M M I MIMMMMMMI MM! Mill 

AAGTATATCAGCAAGCCGGAGAACCTG - AACTCATGATGAACCTNCTTCGGGAT - AAAGT 
CCCAACATCCAGTTTGAAGCCTTTCATGTT 828 

MMMMMMMMI III III 



798 



RESULT 6 
AAX39817 

ID AAX39817 standard; DNA; 831 BP. 
XX 

AC AAX39817; 
XX 

DT 02-JUL-1999 (first entry) 
XX 

DE Gastric cancer associated gene. 
XX 

KW Cancer associated antigen; diagnosis; research; treatment; human; 

KW breast cancer; colon cancer; gastric cancer; renal cancer; lung cancer; 

KW prostate cancer; ss. 

XX 

OS Homo sapiens. 
XX 

PN WO9904265-A2. 

PD 28-JAN-1999 
XX 

PF 15-JUL-1998; 98WO 
XX 

PR 22-JUN-1998; 98US-0102322 
PR 17-JUL-1997; 97US-0896164 
PR 10-OCT-1997; 97US-0061599 
PR 10-OCT-1997; 97US-0061765 
PR 10-OCT-1997; 97US-0948705 
PR 11-OCT-1997; 97GB-0021697 
XX 

PA (LUDW-) LUDWIG INST CANCER RES. 
XX 

PI Chen Y, Gout I, Gure A, O'Hare M, Obata Y, Old LJ; 

PI Pfreundschuh M, Sahin U, Scanlan MJ, Stockert E; 

PI Tureci O; 
XX 

DR WPI; 1999-132448/11. 
XX 

PT New isolated cancer associated nucleic acids and polypeptides - 

PT isolated using sera from cancer patients, used to develop products 

PT for the diagnosis, monitoring or treatment of cancers 
XX 

PS Claim 67; Page 558-559; 787pp; English. 
XX 

CC The invention relates to a method for diagnosing a disorder characterised 

CC by expression of a human cancer associated antigen precursor coded for by 

CC a nucleic acid molecule (NAM). The method comprises: (a) contacting a 

CC biological sample isolated from a subject with an agent that specifically 

CC binds to the NAM, an expression product or a fragment of an expression 

CC product complexed with an HLA molecule; and (b) determining the 

CC interaction between the agent and the NAM or the expression product as a 

CC determination of the disorder. The products and methods can be used in 

CC the diagnosis, monitoring, research, or treatment of conditions 

CC characterised by the expression of various cancer associated antigens. 

CC The invention provides nucleic acid sequences and encoded polypeptides 

CC which are cancer associated antigen precursors expressed in human breast 

CC cancer, renal cancer, colon cancer, gastric cancer, prostate cancer and 

CC lung cancer. 

XX 

SQ Sequence 831 BP; 285 A; 165 C; 167 G; 209 T; 5 other; 



Query Match 67.5%; Score 684.6; DB 20; Length 831; 

Best Local Similarity 96.1%; Pred. No. 1.9e-180; 

Matches 764; Conservative 0; Mismatches 23; Indels 8; Gaps 6; 
Qy 1 ATGAAAAAAATGCCTTTGTTTAGTAAATCACACAAA^ 60 

MM III MM Mil I Mill Mill III I llilll III Mill III III MM MM I 

Db 37 ATGAAAAAAATGCCTTTGTTTAGTAAATCACACAAAAATCC^ 96 

Qy 61 CTGAAAGACAATTTGG CCATTTTGGAAAAG CAAGACAAAAAGACAGACAAGG CTTCAGAA 12 0 

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 i 1 1 1 1 1 1 1 1 1 1 1 1 1 M 1 1 1 1 ) 1 1 1 1 i I 

Db 97 CTGAAAGACAATTTGGCCATTTTGGAAAAGCAAGACAAAAAGACAGA 156 



Qy 121 GAAGTGTCTAAAT(^CTGCAAGCAATGAAAGAAATTCTGTGTGGTACAAACGAGAAAGAA 18 0 

I II I M II I M I M M 1 1 II II M II 1 1 II I II II 1 1 II II II II I II II M I II I II II 



Db 



157 GAAGTGTCTAAATCACTGCAAGC^ 216 



Qy 


181 


Db 


217 


Qy 


241 


Db 


277 


Qy 


301 


Db 


337 


Qy 


361 


Db 


397 


Qy 


421 


Db 


457 


Qy 


481 


Db 


517 


Qy 


541 


Db 


577 


Qy 


599 


Db 


637 


Qy 


659 


Db 


697 


Qy 


717 


Db 


757 


Qy 


773 


Db 


817 



CCCCC^ACAGAAGCAGTGGCTCAGCTAGCACAAGAACTCTACAGCAGTGGCCTGCTAGTG 

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1! 1 1 1 1 1 M I 1 1 1 1 1 1 1 1 1 1 1 1 1 II I 1 1 1 1 1 ! 1 1 1 1 ! II 

CCCCCAACAGAAGCAGTGGCTCAGCTAGCACAAGAACTCT^ 

ACACTGATAGCTGACCTGCAGCTGATAGACTTTGAGGGAAAAAAAGATGTGACCCAGATA 

1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1! 1 1 ! 1 1 II 1 1 1 1 1 1 1 1 M 1 1 1 1 1 1 1 II 1 1 1 1 1 M 1 1 II 

ACACTGATAG CTGACCTG CAG CTGATAGACTTTGAGGGAAAAAAAGATGTGACCCAGATA 



1 1 1 1 1 1 1 1 1 1 1 1 : 1 1 1 ! i II 1 1 : 1 1 II 1 1 1 1 1 i i 1 1 M 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 



GCTCATCCTCATATCCTGTTTATGCTCCTCAAAGGATATGAAGCCCCACAGATTGCCTTA 

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M 1 1 1 1 1 II I ! 1 1 1 1 M I II 1 1 1 1 1 1 1 1 1 M 1 1 1 1 1 1 1 1 i 

GCTCATCCTCATATCCTGTTTATGCTCCTCAAAGGATATGAAGCCCCACAGATTGCCTTA 



1 1 1 1 1 1 1 1 1 ! 1 1 1 1 II 1 1 1 1 1 1 1 1 1'l 1 1 M 1 1 1 1 1 M 1 1 i 1 1 M 1 1 1 1 1 1 1 1 1 1 1 1 



TTTTCTAATCAATTCAGAGATTTCTTTAAGTACGTGGAGTTGTCAACATTTGATATTGCT 

Mi M I M 1 1 1 1 1 1 1: II 1 1 1 1 II I MM 1 1 II 1 1 1 1 1 1 1 1 II II i 1 1 Ml M I Mill 

TTTTCTAATCAATTCAGAGATTTCTTTAAGTACGTGGAGTTGTCAACATTTGATATTGCT 



1 1 1 1 Ml IN 1 1 1 II 1 1 1 1 1 1 1 I M III II 1 1 1 1 1 1 1 1 1 II II 1 1 1 1 1 1 1 II Ml 1 1 



ACTTCTTAGAACAAAATTACGACACTATTTTTGAAGACTATGAGAAATTGCTTCAGTCTG 

MMMMMMMMMMMMMI M M M M M M M M M M M M M M M M 

ACTTCTTAGAACAAAATTACGACACTANTTTTGAAGACTATGAGAAATTGCTTCAGTCTG 



! MMMMMI MMMMMI 



I MMM 1 1 



Ml MMMM 



TCACAACTTTGCCATC-ATGACAAAGTATATCAGCAAGCC GGAGAACCTGAAACTCA 

Ml MMMMMM I I Mill Mill MMM III III! I 

TCANAACTTTGCCATCAANGCAAAAGTTTAT 



787 



240 



276 



300 



336 



420 



456 



540 



576 



658 



696 



772 



816 



TGATGAACCTCCTTC 

II MMMMMI 

GGAGGAACCTCCTTC 831 



RESULT 7 
AAI60020/C 

ID AAI60020 standard; cDNA; 1191 BP. 
XX 

AC AAI60020; 
XX 

DT 22-OCT-2001 (first entry) 
XX 

DE Human polynucleotide SEQ ID NO 4009. 
XX 



KW Human; nootropic; immunosuppressant; cytostatic; gene therapy; cancer; 

KW peripheral nervous system; neuropathy; central nervous system; CNS; 

KW Alzheimer's; Parkinson's disease; Huntington's disease; haemostatic; 

KW amyotrophic lateral sclerosis; Shy-Drager Syndrome; chemotactic; 

KW chemokinetic; thrombolytic; drug screening; arthritis; inflammation; 

KW leukaemia; ss. 

XX 

OS Homo sapiens. 
XX 

PN WO200153312-A1. 
XX 



PD 26-JUL-2001. 
XX 

PF 26-DEC-2000; 2000WO-US34263 
XX 

PR 21-JAN-2000; 2000US-0488725 
PR 25-APR-2000; 2000US-0552317 
PR 09-JUL-2000; 2000US-0598042 
PR 19-JUL-2000; 2000US-0620312 
PR 03-AUG-2000; 2000US-0653450 
PR 14-SEP-2000; 2000US-0662191 
PR 19-OCT-2000; 2000US-0693036 
PR 29-NOV-2000; 2000US-0727344 
XX 

PA (HYSE-) HYSEQ INC. 
XX 

PI Tang YT, Liu C, Asundi V, Chen R, Ma Y, Qian XB, Ren F, Wang D; 

PI Wang J, Wang Z, Wehrman T, Xu C, Xue AJ, Yang Y, Zhang J; 

PI Zhao QA, Zhou P, Goodrich R, Drmanac RT; 
XX 

DR WPI; 2001-442253/47. 

DR P-PSDB; AAM40864. 
XX 

PT Novel nucleic acids and polypeptides, useful for treating disorders 

PT such as central nervous system injuries - 

XX 

PS Claim 1; SEQ ID NO 4009; 10078pp; English. 
XX 

CC The invention relates to human nucleic acids (AAI57798-AAI61369) and 

CC the encoded polypeptides (AAM38642-AAM42213) with nootropic, 

CC immunosuppressant and cytostatic activity. The polynucleotides are useful 

CC in gene therapy. A composition containing a polypeptide or polynucleotide 

CC of the invention may be used to treat diseases of the peripheral nervous 

CC system, such as peripheral nervous injuries, peripheral neuropathy and 

CC localised neuropathies and central nervous system diseases, such as 

CC Alzheimer's, Parkinson's disease, Huntington's disease, amyotrophic 

CC lateral sclerosis, and Shy-Drager Syndrome. Other uses include the 

CC utilisation of the activities such as: Immune system suppression, 

CC Activin/inhibin activity, chemotactic/chemokinetic activity, haemostatic 

CC and thrombolytic activity, cancer diagnosis and therapy, drug screening, 

CC assays for receptor activity, arthritis and inflammation, leukaemias and 

CC C.N.S disorders. 

CC Note: The sequence data for this patent did not form part of the printed 

CC specification. 

XX 

SQ Sequence 1191 BP; 348 A; 261 C; 236 G; 346 T; 0 other; 



Query Match 67.5%; Score 684.4; DB 22; Length 1191; 

Best Local Similarity 99.9%; Pred. No. 2.6e-180; 

Matches 685; Conservative 0; Mismatches 1; Indels 0; Gaps 0; 



Qy  329 GCACTCGGAGTCCTACTGTGGAGTATATTAGTGCTCATCCTCATATCCTGTTTATGCTCC 388
        ||||||| ||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1189 GCACTCGAAGTCCTACTGTGGAGTATATTAGTGCTCATCCTCATATCCTGTTTATGCTCC 1130

Qy  389 TCAAAGGATATGAAGCCCCACAGATTGCCTTACGTTGTGGGATTATGCTGAGAGAATGTA 448
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1129 TCAAAGGATATGAAGCCCCACAGATTGCCTTACGTTGTGGGATTATGCTGAGAGAATGTA 1070

Qy  449 TTCGACATGAACCACTTGCCAAAATCATCCTCTTTTCTAATCAATTCAGAGATTTCTTTA 508
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1069 TTCGACATGAACCACTTGCCAAAATCATCCTCTTTTCTAATCAATTCAGAGATTTCTTTA 1010

Qy  509 AGTACGTGGAGTTGTCAACATTTGATATTGCTTCAGATGCCTTTGCTACTTTCAAGGATT 568
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1009 AGTACGTGGAGTTGTCAACATTTGATATTGCTTCAGATGCCTTTGCTACTTTCAAGGATT 950

Qy  569 TACTAACCAGACATAAAGTGTTGGTAGCAGACTTCTTAGAACAAAATTACGACACTATTT 628
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  949 TACTAACCAGACATAAAGTGTTGGTAGCAGACTTCTTAGAACAAAATTACGACACTATTT 890

Qy  629 TTGAAGACTATGAGAAATTGCTTCAGTCTGAGAATTATGTTACTAAGAGACAGTCTTTAA 688
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  889 TTGAAGACTATGAGAAATTGCTTCAGTCTGAGAATTATGTTACTAAGAGACAGTCTTTAA 830

Qy  689 AGCTGCTAGGGGAGCTGATCCTGGACCGTCACAACTTTGCCATCATGACAAAGTATATCA 748
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  829 AGCTGCTAGGGGAGCTGATCCTGGACCGTCACAACTTTGCCATCATGACAAAGTATATCA 770

Qy  749 GCAAGCCGGAGAACCTGAAACTCATGATGAACCTCCTTCGGGATAAAAGTCCCAACATCC 808
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  769 GCAAGCCGGAGAACCTGAAACTCATGATGAACCTCCTTCGGGATAAAAGTCCCAACATCC 710

Qy  809 AGTTTGAAGCCTTTCATGTTTTTAAGGTGTTTGTGGCCAGTCCTCACAAAACACAGCCTA 868
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  709 AGTTTGAAGCCTTTCATGTTTTTAAGGTGTTTGTGGCCAGTCCTCACAAAACACAGCCTA 650

Qy  869 TTGTGGAGATCCTGTTAAAAAATCAGCCCAAACTCATTGAGTTTCTGAGCAGCTTCCAAA 928
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  649 TTGTGGAGATCCTGTTAAAAAATCAGCCCAAACTCATTGAGTTTCTGAGCAGCTTCCAAA 590

Qy  929 AAGAAAGGACGGATGATGAGCAGTTCGCTGACGAGAAGAACTACTTGATTAAACAGATCC 988
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  589 AAGAAAGGACGGATGATGAGCAGTTCGCTGACGAGAAGAACTACTTGATTAAACAGATCC 530

Qy  989 GAGACTTGAAGAAAACGGCCCCTTGA 1014
        ||||||||||||||||||||||||||
Db  529 GAGACTTGAAGAAAACGGCCCCTTGA 504
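
Result 7 is listed as AAI60020/C, and its Db coordinates run downward from 1189 while the Qy coordinates run upward from 329. That pattern is consistent with the hit lying on the complementary strand of the database entry, although the report itself does not define the /C suffix, so this reading is an assumption. Under that assumption, the short Python sketch below shows how a displayed Db row relates to the stored entry and how the descending end coordinate of a row follows from its start. The function names are illustrative and are not part of the search software.

```python
# Sketch only: assumes "/C" marks a match against the reverse complement
# of the stored database sequence. Function names are hypothetical.

_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a nucleotide string (A, C, G, T, N)."""
    return seq.translate(_COMPLEMENT)[::-1]


def descending_span(start: int, n_bases: int) -> tuple[int, int]:
    """Start and end coordinates of a row that is numbered downward."""
    return start, start - n_bases + 1


if __name__ == "__main__":
    # First ten bases of the Db row that starts at 1189 in Result 7.
    displayed = "GCACTCGAAG"
    print(reverse_complement(displayed))  # stored-strand reading of that stretch
    print(descending_span(1189, 60))      # a 60-base row starting at 1189
```

Applying `reverse_complement` twice returns the input, which is a quick sanity check when converting between the displayed row and the stored entry.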



RESULT 8 
AAC91772 

ID AAC91772 standard; cDNA; 1026 BP. 
XX 



AC AAC91772; 
XX 

DT 27-MAR-2001 (first entry) 
XX 

DE Human ANIC-BP (acute neuronal induced calcium-binding protein) cDNA. 
XX 

KW Human; acute neuronal induced calcium-binding protein; ANIC-BP; 

KW Mo25 homologue; HymA homologue; drug screening; stroke; 

KW acute head trauma; multiple sclerosis; spinal cord injury; vaccine; 

KW cerebroprotective; neuroprotective; ss. 

XX 

OS Homo sapiens. 
XX 

PN WO200078947-A1. 
XX 

PD 28-DEC-2000. 
XX 

PF 14-JUN-2000; 2000WO-EP05457. 
XX 

PR 22-JUN-1999; 99EP-0112024. 
XX 

PA (MERE ) MERCK PATENT GMBH. 
XX 

PI Den Daas I, Fischer V, Seyfried C, Von Melchner L; 
XX 

DR WPI; 2001-102721/11. 

DR P-PSDB; AAB48970. 
XX 

PT Novel acute neuronal induced calcium binding protein, useful for 

PT treating acute head trauma, stroke, multiple sclerosis and spinal cord 

PT injury 

XX 

PS Claim 5; Page 35-36; 50pp; English. 
XX 

CC The invention relates to human acute neuronal induced calcium-binding 

CC protein (ANIC-BP) and to nucleic acid encoding it. The invention 

CC also relates to expression systems and recombinant host cells comprising 

CC ANIC-BP DNA, the recombinant production of ANIC-BP, antibodies specific 

CC for ANIC-BP, fusion proteins comprising ANIC-BP and an immunoglobulin 

CC Fc region, and methods of screening for modulators of ANIC-BP function. 

CC ANIC-BP has homology and structural similarity to HymA and Mo25 proteins. 

CC ANIC-BP proteins and nucleotides are useful for treating stroke and 

CC acute head trauma, multiple sclerosis and spinal cord injury. ANIC-BP 

CC proteins are useful in screening assays, for identifying membrane bound 

CC or soluble receptors, and also in vaccines. ANIC-BP nucleotides are 

CC useful as diagnostic reagents, as tools for tissue expression studies, 

CC for chromosome localisation studies, as genetic vaccines, and in 

CC the generation of transgenic animals. The present sequence represents 

CC cDNA encoding human ANIC-BP. 

XX 

SQ Sequence 1026 BP; 359 A; 199 C; 203 G; 265 T; 0 other; 

Query Match 57.5%; Score 582.6; DB 22; Length 1026; 

Best Local Similarity 74.7%; Pred. No. 5.5e-152; 

Matches 748; Conservative 0; Mismatches 244; Indels 9; Gaps 1; 



Qy 



18 GTTTAGTAAATCACACAAAAATCCAGCAGAAATTGTGAAA 77 



Db 


12 


Qy 


78 


Db 


72 


Qy 


129 


Db 


132 


Qy 


189 


Db 


192 


Qy 


249 


Db 


252 


Qy 


309 


Db 


312 


Qy 


369 


Db 


372 


Qy 


429 


Db 


432 


Qy 


489 


Db 


492 


Qy 


549 


Db 


552 


Qy 


609 


Db 


612 


Qy 


669 


Db 


672 


Qy 


729 


Db 


732 


Qy 


789 


Db 


792 


Qy 


849 



1 1 II I II II Mill lllllllll llllllll I Mill II I I 



CATTTTGGAAAAGCAAGAC AAAAAGACAGACAAGGCTTCAGAAGAAGTGTC 128 

II I I MM Mill MM llllll IIIMIIIII M 

TGTT CTGGAAAAG CAAGACATTTCTGATAAAAAAG CAGAAAAGG CTACAGAAGAAGTTTC 131 
TAAATCACTGCAAGCAATGAAAGAAATTCTGTGTGGT^ 188 

III Ml II llllllllllllllll III Mill II Illll II I M 

CAAAAATCTGGTTGC(^TGAAAGAAATTCTGTATGGC^CAAATGAAAAAGAGCCTC^GAC 191 
AGAAGC^GTGGCTCAGCTAGCACAAGAACTCTACAGCAGTGGCCTGCTAGTGACACTGAT 248 

lllllllll Mill II II MIMIIMM I Illll II M II III I 

AGAAGCAGTAGCTCAACTTGCTCAAGAACTCTATAATAGTGGGCTCCTTAGCACCCTGGT 251 



llllll I Mill M MIMIIMM llllllll Ml I II II II Mill 



CAT CTTGAGAAGACAGATAGG CACTCGGAGT C CTACTGTGGAGTATATTAGTG CTCATCC 3 68 

II I llllllll II II II II lllllllll II II II I Ml I 

TATTCTCAGAAGACAAATTGGTACGAGAACTCCTACTGTTGAATACATCTGCACCCAACA 371 
TCATATCCTGTTTATGCTCCTCAAAGGATATGAAGCCCCACAGATTGCCTTACGTTGTGG 428 

MM MM Ml I I Illll llllll I III I II II II llllll 

GAATATTTTGTTCATGTTATTGAAAGGGTATGAATCTCCAGAAATAGCTCTAAATTGTGG 431 
GATTATGCTGAGAGAATGTATTCGACATGAACCACTTGCCAAAATCATCCTCTTTTCTAA 4 88 

II III I llllllll II llllllllllllllll llllllll II M I 

AATAATGTTAAGAGAATGCATCAlGACATGAACCACTTGCAAAAATCATTTTGTGGTCGGA 4 91 

TCAATTCAGAGATTTCTTTAAGTACGTGGAGTTGTCAACATTTGATATTGCTTCAGATGC 548 

II II llllllll I II II II IIIIIMIIIIII II lllllllllll 

ACAGTTTTATGATTTCTTCAGATATGTCGAAATGTCAACATTTGACATAGCTTCAGATGC 551 

CTTTGCTACTTTCAAGGATTTACTAACCAGACATAAAGTGTTGGTAGCAGACTTCTTAGA 608 

MM II I MM II lllllllll II I Illll II II II 

ATTTGCCACATTCAAGGATTTACTTAGAAGACATAAATTGCTCAGTGCAGAATTTTTGGA 611 
ACAAAATTACGACACTATTTTTGAAGACTATGAGAAATTGCTTCAGTCTGAGAATTATGT 668 

III Ml M I Ml II MIMIII II Illll II II llllllll 

ACAGCATTATGATAGATTTTTCAGTGAATATGAGAAGTTACTTCATTCAGAAAATTATGT 671 
TACTAAGAGACAGTCTTTAAAGCTGCTAGGGGAGCTGATCCTGGACCGTCACAACTTTGC 728 

II II llllllll I Illll II II M M Mill I llllllll I 

GACAAAAAGACAGTCACTGAAGCTTCTCGGTGAACTACTACTAGATAGACAOUVCTTCAC 731 
CATCATGACAAAGTATATCAGCAAGCCGGAGAACCTGAAACTCATGATGAACCTCCTTCG 788 

II llllllll II Illll II II llllllll Ml I lllllllllll II II 

AATTATGACAAAATACATCAGTAAACCTGAGAACCTCAAATTAATGATGAACCTGCTGCG 791 



I IIIIMI MMMMIMI llllllll lllllllllll Ml 



84 9 TCCTCACAAAACACAGCCTATTGTGGAGATCCTGTT^ 908 

Ml MM II Illll II I M MM I II II III MIMIII II 



Db 



8 52 TCCTAACAAGACGCAGCCCATC^ 911 



Qy 



9 09 GTTTCTGAGCAG CTTCCAAAAAGAAAGGACGGATGATGAG CAGTTCG CTGACGAGAAGAA 968 



Db 




1 1 1 1 E 1 1 1 1 1 

3ACGAGAAGAC 971 



Qy 



969 CTACTTGATTAAACAGATCCGAGACTTGAAGAAAACGGCCC 1009 



Db 




RESULT 9 
ABK13127 

ID ABK13127 standard; cDNA; 3281 BP. 
XX 

AC ABK13127; 
XX 

DT 09-APR-2002 (first entry) 
XX 

DE Human secretory polynucleotide (sptm) cDNA (481257.3). 
XX 

KW Signal peptide; transmembrane domain; human; sptm; ss; gene; 

KW 481257.3; antiarteriosclerotic; antiatherosclerotic; antipsoriatic; 

KW antiinflammatory; cytostatic; anti-HIV; antiallergic; antidiabetic; 

KW nephrotropic; antigout; antithyroid; hepatotropic; neuroprotective; 

KW osteopathic; antirheumatic; antiarthritic; dermatological; cancer; 

KW immunosuppressive; antiulcer; ophthalmological; vulnerary; gout; 

KW anticonvulsant; cerebroprotective; nootropic; antiparkinsonian; 

KW virucide; antibacterial; cell proliferative disorder; arteriosclerosis; 

KW atherosclerosis; psoriasis; immune system disorder; inflammation; 

KW acquired immunodeficiency syndrome; AIDS; Addison's disease; 

KW adult respiratory distress syndrome; allergy; cirrhosis; osteoporosis; 

KW diabetes mellitus; Graves' disease; multiple sclerosis; osteoarthritis; 

KW rheumatoid arthritis; systemic lupus erythematosus; ulcerative colitis; 

KW haematopoietic cancer; neurological disorder; stroke; epilepsy; 

KW Huntington's disease; Parkinson's disease; meningitis; prion disease; 

KW kuru; Creutzfeldt-Jakob disease; cerebral palsy; myasthaenia gravis; 

KW diabetic neuropathy; Alzheimer's disease. 

XX 

OS Homo sapiens. 
XX 

PN WO200111032-A1. 
XX 

PD 15-FEB-2001. 
XX 

PF 01-JUN-2000; 2000WO-US15246. 
XX 

PR 05-AUG-1999; 99US-147500P. 

PR 05-AUG-1999; 99US-147501P. 
XX 

PA (INCY-) INCYTE GENOMICS INC. 
XX 

PI Hodgson DM, Lincoln SE, Russo FD, Spiro PA, Banville SC; 

PI Bratcher SR, Dufour GE, Cohen HJ, Rosen BH, Chalup MS, Hillman JL; 

PI Jones AL, Yu JY, Greenawalt LB, Panzer SR, Roseberry AM; 

PI Wright RJ, Daniels SE; 

XX 



DR WPI; 2002-147236/19. 
XX 

PT Novel secretory polynucleotide (sptm) and polypeptides encoded by sptm, 

PT useful for diagnosing and treating disorders or diseases associated 

PT with cell signaling e.g., allergy, psoriasis, Grave's disease, epilepsy 

PT 

XX 

PS Claim 1; Page 192-193; 198pp; English. 
XX 

CC This invention relates to novel cDNA molecules encoding isolated 

CC secretory polynucleotides (sptm) with similarity to signal peptide 

CC (SP) or transmembrane domain (TM) consensus sequences. The 

CC polynucleotide sequences of the invention are useful for producing 

CC sptm protein by recombinant techniques, the protein may be used to 

CC generate anti-sptm antibodies which may be used to analyse protein 

CC expression levels in different tissues. The sptm molecules are useful 

CC for diagnostic and therapeutic purposes e.g., to diagnose or treat a 

CC condition associated with cell signaling such as a cell proliferative 

CC disorders (e.g., arteriosclerosis, atherosclerosis, psoriasis, cancers), 

CC immune system disorders (e.g., inflammation, acquired immunodeficiency 

CC syndrome (AIDS), Addison's disease, adult respiratory distress syndrome, 

CC allergies, cirrhosis, diabetes mellitus, gout, Graves' disease, 

CC multiple sclerosis, osteoarthritis, osteoporosis, rheumatoid arthritis, 

CC systemic lupus erythematosus, ulcerative colitis and haematopoietic 

CC cancer), a neurological disorder (e.g., stroke, epilepsy, Huntington's 

CC disease, Parkinson's disease, meningitis, prion diseases including kuru, 

CC Creutzfeldt-Jakob disease, cerebral palsy, myasthenia gravis, diabetic 

CC neuropathy and Alzheimer's disease). Sptm sequences can be used to 

CC detect the presence of or quantifying the amount of sptm-related 

CC polynucleotide in a sample. The sptm polynucleotide is used to design 

CC probes useful in diagnostic assays carried out to detect or confirm 

CC conditions, disorders, or diseases associated with abnormal levels of 

CC sptm expression. Sptm, its fragments or oligonucleotides derived from 

CC sptm may be used as primers in amplification steps prior to 

CC hybridisation. The present sequence represents the human sptm (481257.3) 

CC cDNA sequence of the invention. 

XX 

SQ Sequence 3281 BP; 1014 A; 601 C; 676 G; 990 T; 0 other; 

Query Match 57.5%; Score 582.6; DB 24; Length 3281; 

Best Local Similarity 74.7%; Pred. No. 9.3e-152; 

Matches 748; Conservative 0; Mismatches 244; Indels 9; Gaps 1; 
Qy 18 GTTTAGTAAATCACACAAAAATCCAGC^GAAATTG 77 

1 1 1 1 I II 1 1 llllll MINIMI MINIM I Mill II I 1 1 1 1 

Db 101 GTTTGGGAAGTCTCA.CAAATCTCCAGCAGACATT 160 
Qy 78 CATTTTGGAAAAGCAAGAC AAAAAGA CAGA CAAGG CTT CAGAAGAAG TG T C 128 

II Mlllllillllll Mill 1 1 1 1 llllll llllllllll II 

Db 161 TGTTCTGGAAAAGCAAGACATTTCTGATAAAAAAGCAGAAAAGGCTACAGAAGA 22 0 

Qy 12 9 TAAATCACTGCAAGCAATGAAAGAAATTCTGTGTGGTACAAACGAGA 188 

Ml III II III MINIMI III I III Mill II Mill II I II 

Db 221 CAAAAATCTGGTTGCCATGAAAGAAATTCTGTATGGC^ 28 0 

Qy 18 9 AGAAG CAGTGG CTCAG CTAG CACAAGAACTCTACAG CAGTGG CCTGCTAGTGACACTGAT 248 

MMMMI Mill II II MINIM I INN II II II IN I 



Db 2 81 AGAAGCAGTAGCTCAACTTGCTCAAGAACTCTATAATAGTGGGCTCCTTAGCACCCTGGT 34 0 

Qy 24 9 AG CTGACCTGCAG CTGATAGACTTTGAGGGAAAAAAAGATGTGACC CAGATATTTAACAA 3 08 

I I I I I I I Mill II lllllllllll Illlllll III I II II II Mill 

Db 341 AGCTGATTTACAGCTCATTGACTTTGAGGGCAAAAAAGACGTGGCTCAAATTTTCAACAA 4 00 

Qy 3 09 CATCTTGAGAAGACAGATAGGCACTCGGAGTCCTACTGTGGAGTATATTAGTGCTCATCC 368 

II I Illlllll II II II I I III Ml I II II II II I I II I 

Db 4 01 TATTCTCAGAAGACAAATTGGTACGAGAACTCCTACTGTTGAATACATCTGCACCCAACA 4 60 

Qy 369 TCATATCCTGTTTATGCTCCTCAAAGGATATGAAGCCCCACAGATTGCCTTACGTTGTGG 428 

MM MM III I I Mill MUM I III I II II II MUM 

Db 4 61 GAATATTTTGTTCATGTTATTGAAAGGGTATGAATCTCCAGAAATAGCTCTAAATTGTGG 52 0 

Qy 42 9 GATTATGCTGAGAGAATGTATTCGACATGAACCACTTGCCAAAATCATCCTCTTTTCTAA 488 

I I I II I 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 II 1 1 1 II II I 

Db 521 AATAATGTTAAGAGAATGCATCAGACATGAACCACTTGCAAAAATCATTTTGTGGTCGGA 58 0 

Qy 48 9 TCAATTCAGAGATTTCTTTAAGTACGTGGAGTTGTCAACATTTGATATTGCTTCAGATGC 548 

II II Mllllll I II II II I II 1 1 II I II III II I MM 1 1 MM 

Db 581 ACAGTTTTATGATTTCTTCAGATATGTCGAAATGTCAACATTTGACATAGCTTCAGATGC 64 0 

Qy 54 9 CTTTGCTACTTTCAAGGATTTACTAACCAGACATAAAGTGTTGGTAGCAGACTTCTTAGA 6 08 

Mill II I II Mil II MM II I lllllllllll 

Db 641 ATTTGCi^CATTCAAGGATTTACTTAC^GACATAAATTGCTCAGTGCAGAATTTTTGGA 7 00 

Qy 609 ACAAAATTACGACACTATTTTTGAAGACTATGAGAAATTGCTTCAGTCTGAGAATTATGT 668 

Ml MM II I MM II Illlllll II Mill II II Illlllll 

Db 701 ACAG CATTATGATAGATTTTT CAGTGAATATGAGAAGTTACTTCATTCAGAAAATTATGT 760 

Qy 669 TACTAAGAGACAGTCTTTAAAGCTGCTAGGGGAGCTGATCCTGGACCGTCACAACTTTGC 728 

II II Illlllll I Mill II II II II Mill I Illlllll I 

Db 761 GACAAAAAGACAGTCACTGAAGCTTCTCGGTGAACTACTACTAGATAGACACAACTTCAC 82 0 

Qy 72 9 CATCATGAC^AAGTATATCAGCAAGCCGGAGAACCTGAAACTCATGATGAACCTCCTTCG 788 

II Illlllll M Mill II II Mllllll III I IMIMIMM II II 

Db 821 AATTATGACAAAATACATCAGTAAACCTGAGAACCTCAAATTAATGATGAACCTGCTGCG 8 80 

Qy 78 9 GGATAAAAGTCCCAACATCCAGTTTGAAGCCTTTCATGTTTTTAAGGTGTTTGTGGCCAG 84 8 

II MM II I II II Illlllll III Illlllll 1 1 1 1 1 1 1 1 1 1 1 1 Ml MM 

Db 8 81 AGACAAAAGTCGCAACATCCAGTTTGAGGCCTTTCACGTTTTTAAGGTGTTTGTAGCCAA 94 0 

Qy 84 9 TCCTC^CmA^CACAGCCTATTGTGGAGATCCTGTTAAAAAATCAGCCCAAACTCATTGA 908 

MM MM II Mill M I II Mill I II II III Illlllll II 

Db 941 TCCTAACAAGACGCAGCCCATCCTAGACATCCTCCTCAAGAACCAGGCCAAACTCATAGA 1000 

Qy 909 GTTTCTGAGCAG CTTC CAAAAAGAAAGGACGGATGATGAG CAGTTCG CTGACGAGAAGAA 968 

III II MM II II II II Mllllll lllllllllll llllllllll 

Db 10 01 GTT CCT CAG CAAGTTT CAGAACGACAGGACGGAGGATGAG CAGTTTAACGACGAGAAGAC 1060 

Qy 96 9 CTACTTGATTAAACAGATCCGAGACTTGAAGAAAACGGCCC 1009 

Ml M lllllllllll I II MM I I II I 

Db 1061 CTATTTAGTTAAACAGATCAGGGATTTGAAGAGACCAGCTC 1101 
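
The two percentages printed for each hit appear to follow directly from the counts on the same lines. The Python sketch below reproduces them under two assumptions that the report does not state explicitly: Query Match is the hit score divided by the perfect score (taken here as 1014, the query length under identity scoring), and Best Local Similarity is the match count divided by the number of aligned columns (matches plus mismatches plus indels). The function names are illustrative.

```python
# Sketch only: both formulas are inferred from the numbers in this report,
# not taken from the search program's documentation.

PERFECT_SCORE = 1014.0  # assumed: query length, identity scoring


def query_match_pct(score: float, perfect_score: float = PERFECT_SCORE) -> float:
    """Hit score as a percentage of the perfect (self-match) score."""
    return round(100.0 * score / perfect_score, 1)


def best_local_similarity_pct(matches: int, mismatches: int, indels: int) -> float:
    """Matching columns as a percentage of all aligned columns."""
    aligned_columns = matches + mismatches + indels
    return round(100.0 * matches / aligned_columns, 1)


if __name__ == "__main__":
    # Result 9 (ABK13127): Score 582.6; Matches 748; Mismatches 244; Indels 9
    print(query_match_pct(582.6))                   # reported as 57.5%
    print(best_local_similarity_pct(748, 244, 9))   # reported as 74.7%
    # Result 5 (AAH05471): Score 770.6; Matches 798; Mismatches 10; Indels 2
    print(query_match_pct(770.6))                   # reported as 76.0%
    print(best_local_similarity_pct(798, 10, 2))    # reported as 98.5%
```

The same two functions also reproduce the figures listed for Results 6 and 7, which supports the assumed definitions without proving them.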



RESULT 10 
ABV22987 



ID ABV22987 standard; cDNA; 3849 BP. 
XX 

AC ABV22987; 
XX 

DT 13-SEP-2002 (first entry) 
XX 

DE Human prostate expression marker cDNA 22978. 
XX 

KW Human; prostate cancer; cytostatic; carcinogen; pharmacodyanamic marker; 

KW pharmacogenomic marker; gene; ss. 

XX 

OS Homo sapiens. 
XX 

PN WO200160860-A2. 
XX 

PD 23-AUG-2001. 
XX 

PF 20-FEB-2001; 2001WO-US05171. 
XX 

PR 17-FEB-2000; 2000US-183319P. 

PR 16-MAR-2000; 2000US-189862P. 

PR 25-MAY-2000; 2000US-207454P. 

PR 09-JUN-2000; 2000US-211314P. 

PR 18-JUL-2000; 2000US-219007P. 

PR 13-DEC-2000; 2000US-255281P. 
XX 

PA (MILL-) MILLENNIUM PREDICTIVE MEDICINE INC. 
XX 

PI Schlegel R, Endege WO, Monahan JE; 
XX 

DR WPI; 2001-662795/76. 
XX 

PT Novel isolated nucleic acid molecule associated with cancerous state of 

PT prostate cells and correlating with presence of prostate cancer, useful 

PT for detecting presence of prostate cancer, stage of prostate cancer 
XX 

PS Claim 1; Page 4088; 11750pp; English. 
XX 

CC The invention relates to an isolated nucleic acid molecule (I) comprising 

CC a nucleotide sequence given in Tables 1-9 (ABV00010-ABV62213) of the 

CC specification or its complement. (I) is useful for: 

CC (a) assessing whether a patient is afflicted with prostate cancer; 

CC (b) monitoring the progression of prostate cancer in a patient; 

CC (c) assessing the efficacy of a test compound to inhibit prostate 

CC cancer in a patient; 

CC (d) assessing the efficacy of a therapy for inhibiting prostate cancer 

CC in a patient; 

CC (e) selecting a composition for inhibiting prostate cancer in a patient; 

CC (f) assessing the prostate cell carcinogenic potential of a compound; 

CC (g) determining whether prostate cancer has metastasized in a patient; 

CC (h) assessing the aggressiveness or indolence of prostate cancer in a 

CC patient; 

CC (I) is also useful as a pharmacodyanamic or pharmacogenomic marker. 
XX 

SQ Sequence 3849 BP; 1142 A; 745 C; 858 G; 1081 T; 23 other; 



Query Match 57.5%; Score 582.6; DB 23; Length 3849;

Best Local Similarity 74.7%; Pred. No. 1e-151;

Matches 748; Conservative 0; Mismatches 244; Indels 9; Gaps 1; 
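
The two percentages in the summary lines above can be reproduced from the other reported fields. A minimal sketch in Python, assuming that Query Match is the score divided by the perfect score (1014 for this query) and that Best Local Similarity is the match count divided by the number of aligned columns (matches + mismatches + indels). The function names are illustrative and are not part of GenCore:

```python
def query_match(score, perfect_score):
    """Percentage of the perfect score reached by a hit (assumed definition)."""
    return round(100.0 * score / perfect_score, 1)


def best_local_similarity(matches, mismatches, indels):
    """Percentage of aligned columns that are identical (assumed definition)."""
    columns = matches + mismatches + indels
    return round(100.0 * matches / columns, 1)


# Figures reported for this hit: Score 582.6, perfect score 1014,
# Matches 748, Mismatches 244, Indels 9.
print(query_match(582.6, 1014))
print(best_local_similarity(748, 244, 9))
```

Under these assumptions the two calls give 57.5 and 74.7, the values printed in the report for this hit.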



Qy     18 GTTTAGTAAATCACACAAAAATCCAGCAGAAATTGTGAAAATCCTGAAAGACAATTTGGC 77
          |||| | || || ||||||  ||||||||| |||||||| |  ||||| || |   ||||
Db    437 GTTTGGGAAGTCTCACAAATCTCCAGCAGACATTGTGAAGAATCTGAAGGAGAGCATGGC 496

Qy     78 CATTTTGGAAAAGCAAGAC---------AAAAAGACAGACAAGGCTTCAGAAGAAGTGTC 128
            || ||||||||||||||         |||||  |||| |||||| |||||||||| ||
Db    497 TGTTCTGGAAAAGCAAGACATTTCTGATAAAAAAGCAGAAAAGGCTACAGAAGAAGTTTC 556

Qy    129 TAAATCACTGCAAGCAATGAAAGAAATTCTGTGTGGTACAAACGAGAAAGAACCCCCAAC 188
           |||   |||   || |||||||||||||||| ||| ||||| || ||||| || |  ||
Db    557 CAAAAATCTGGTTGCCATGAAAGAAATTCTGTATGGCACAAATGAAAAAGAGCCTCAGAC 616

Qy    189 AGAAGCAGTGGCTCAGCTAGCACAAGAACTCTACAGCAGTGGCCTGCTAGTGACACTGAT 248
          ||||||||| ||||| || || ||||||||||| |  ||||| || ||    || ||| |
Db    617 AGAAGCAGTAGCTCAACTTGCTCAAGAACTCTATAATAGTGGGCTCCTTAGCACCCTGGT 676

Qy    249 AGCTGACCTGCAGCTGATAGACTTTGAGGGAAAAAAAGATGTGACCCAGATATTTAACAA 308
          ||||||  | ||||| || ||||||||||| |||||||| ||| | || || || |||||
Db    677 AGCTGATTTACAGCTCATTGACTTTGAGGGCAAAAAAGACGTGGCTCAAATTTTCAACAA 736

Qy    309 CATCTTGAGAAGACAGATAGGCACTCGGAGTCCTACTGTGGAGTATATTAGTGCTCATCC 368
           ||  | |||||||| || || ||  | | ||||||||| || || ||  |  | || |
Db    737 TATTCTCAGAAGACAAATTGGTACGAGAACTCCTACTGTTGAATACATCTGCACCCAACA 796

Qy    369 TCATATCCTGTTTATGCTCCTCAAAGGATATGAAGCCCCACAGATTGCCTTACGTTGTGG 428
            ||||  |||| ||| |  | ||||| |||||| | ||| | || ||  ||  ||||||
Db    797 GAATATTTTGTTCATGTTATTGAAAGGGTATGAATCTCCAGAAATAGCTCTAAATTGTGG 856

Qy    429 GATTATGCTGAGAGAATGTATTCGACATGAACCACTTGCCAAAATCATCCTCTTTTCTAA 488
           || ||| | |||||||| ||  |||||||||||||||| ||||||||  | |  ||  |
Db    857 AATAATGTTAAGAGAATGCATCAGACATGAACCACTTGCAAAAATCATTTTGTGGTCGGA 916

Qy    489 TCAATTCAGAGATTTCTTTAAGTACGTGGAGTTGTCAACATTTGATATTGCTTCAGATGC 548
           || ||    |||||||| |  || || ||  ||||||||||||| || |||||||||||
Db    917 ACAGTTTTATGATTTCTTCAGATATGTCGAAATGTCAACATTTGACATAGCTTCAGATGC 976

Qy    549 CTTTGCTACTTTCAAGGATTTACTAACCAGACATAAAGTGTTGGTAGCAGACTTCTTAGA 608
           ||||| || |||||||||||||| || ||||||||| || |    ||||| || || ||
Db    977 ATTTGCCACATTCAAGGATTTACTTACAAGACATAAATTGCTCAGTGCAGAATTTTTGGA 1036

Qy    609 ACAAAATTACGACACTATTTTTGAAGACTATGAGAAATTGCTTCAGTCTGAGAATTATGT 668
          |||  |||| || |   ||||    || |||||||| || ||||| || || ||||||||
Db   1037 ACAGCATTATGATAGATTTTTCAGTGAATATGAGAAGTTACTTCATTCAGAAAATTATGT 1096

Qy    669 TACTAAGAGACAGTCTTTAAAGCTGCTAGGGGAGCTGATCCTGGACCGTCACAACTTTGC 728
           || || ||||||||  | ||||| || || || ||  | || ||  | ||||||||  |
Db   1097 GACAAAAAGACAGTCACTGAAGCTTCTCGGTGAACTACTACTAGATAGACACAACTTCAC 1156

Qy    729 CATCATGACAAAGTATATCAGCAAGCCGGAGAACCTGAAACTCATGATGAACCTCCTTCG 788
           || |||||||| || ||||| || || |||||||| ||| | ||||||||||| || ||
Db   1157 AATTATGACAAAATACATCAGTAAACCTGAGAACCTCAAATTAATGATGAACCTGCTGCG 1216

Qy    789 GGATAAAAGTCCCAACATCCAGTTTGAAGCCTTTCATGTTTTTAAGGTGTTTGTGGCCAG 848
           || ||||||| ||||||||||||||| |||||||| ||||||||||||||||| ||||
Db   1217 AGACAAAAGTCGCAACATCCAGTTTGAGGCCTTTCACGTTTTTAAGGTGTTTGTAGCCAA 1276

Qy    849 TCCTCACAAAACACAGCCTATTGTGGAGATCCTGTTAAAAAATCAGCCCAAACTCATTGA 908
          |||| |||| || ||||| ||  | || |||||  | || || ||| |||||||||| ||
Db   1277 TCCTAACAAGACGCAGCCCATCCTAGACATCCTCCTCAAGAACCAGGCCAAACTCATAGA 1336

Qy    909 GTTTCTGAGCAGCTTCCAAAAAGAAAGGACGGATGATGAGCAGTTCGCTGACGAGAAGAA 968
          ||| || ||||  || || || || |||||||| |||||||||||    ||||||||||
Db   1337 GTTCCTCAGCAAGTTTCAGAACGACAGGACGGAGGATGAGCAGTTTAACGACGAGAAGAC 1396

Qy    969 CTACTTGATTAAACAGATCCGAGACTTGAAGAAAACGGCCC 1009
          ||| ||  ||||||||||| | || ||||||| | | || |
Db   1397 CTATTTAGTTAAACAGATCAGGGATTTGAAGAGACCAGCTC 1437



RESULT 11 
ABV28822 

ID ABV28822 standard; cDNA; 3849 BP.
XX

AC ABV28822;
XX 

DT 16-SEP-2002 (first entry) 
XX 

DE Human prostate expression marker cDNA 28813. 
XX 

KW Human; prostate cancer; cytostatic; carcinogen; pharmacodyanamic marker; 

KW pharmacogenomic marker; gene; ss. 

XX 

OS Homo sapiens.
XX

PN WO200160860-A2.
XX

PD 23-AUG-2001.
XX

PF 20-FEB-2001; 2001WO-US05171.
XX

PR 17-FEB-2000; 2000US-183319P.
PR 16-MAR-2000; 2000US-189862P.
PR 25-MAY-2000; 2000US-207454P.
PR 09-JUN-2000; 2000US-211314P.
PR 18-JUL-2000; 2000US-219007P.
PR 13-DEC-2000; 2000US-255281P.
XX 

PA (MILL-) MILLENNIUM PREDICTIVE MEDICINE INC.
XX 

PI Schlegel R, Endege WO, Monahan JE; 
XX 

DR WPI; 2001-662795/76. 
XX 

PT Novel isolated nucleic acid molecule associated with cancerous state of 
PT prostate cells and correlating with presence of prostate cancer, useful 
PT for detecting presence of prostate cancer, stage of prostate cancer - 
XX 

PS Claim 1; Page 6066-6067; 11750pp; English. 
XX 

CC The invention relates to an isolated nucleic acid molecule (I) comprising 
CC a nucleotide sequence given in Tables 1-9 (ABV00010-ABV62213) of the 



CC specification or its complement. (I) is useful for: 

CC (a) assessing whether a patient is afflicted with prostate cancer; 

CC (b) monitoring the progression of prostate cancer in a patient; 

CC (c) assessing the efficacy of a test compound to inhibit prostate 

CC cancer in a patient; 

CC (d) assessing the efficacy of a therapy for inhibiting prostate cancer 

CC in a patient; 

CC (e) selecting a composition for inhibiting prostate cancer in a patient; 

CC (f) assessing the prostate cell carcinogenic potential of a compound; 

CC (g) determining whether prostate cancer has metastasized in a patient; 

CC (h) assessing the aggressiveness or indolence of prostate cancer in a 

CC patient;

CC (I) is also useful as a pharmacodyanamic or pharmacogenomic marker. 
XX 

SQ Sequence 3849 BP; 1142 A; 745 C; 858 G; 1081 T; 23 other; 



Query Match 57.5%; Score 582.6; DB 23; Length 3849; 

Best Local Similarity 74.7%; Pred. No. 1e-151;

Matches 748; Conservative 0; Mismatches 244; Indels 9; Gaps 1; 



Qy     18 GTTTAGTAAATCACACAAAAATCCAGCAGAAATTGTGAAAATCCTGAAAGACAATTTGGC 77
          |||| | || || ||||||  ||||||||| |||||||| |  ||||| || |   ||||
Db    437 GTTTGGGAAGTCTCACAAATCTCCAGCAGACATTGTGAAGAATCTGAAGGAGAGCATGGC 496

Qy     78 CATTTTGGAAAAGCAAGAC---------AAAAAGACAGACAAGGCTTCAGAAGAAGTGTC 128
            || ||||||||||||||         |||||  |||| |||||| |||||||||| ||
Db    497 TGTTCTGGAAAAGCAAGACATTTCTGATAAAAAAGCAGAAAAGGCTACAGAAGAAGTTTC 556

Qy    129 TAAATCACTGCAAGCAATGAAAGAAATTCTGTGTGGTACAAACGAGAAAGAACCCCCAAC 188
           |||   |||   || |||||||||||||||| ||| ||||| || ||||| || |  ||
Db    557 CAAAAATCTGGTTGCCATGAAAGAAATTCTGTATGGCACAAATGAAAAAGAGCCTCAGAC 616

Qy    189 AGAAGCAGTGGCTCAGCTAGCACAAGAACTCTACAGCAGTGGCCTGCTAGTGACACTGAT 248
          ||||||||| ||||| || || ||||||||||| |  ||||| || ||    || ||| |
Db    617 AGAAGCAGTAGCTCAACTTGCTCAAGAACTCTATAATAGTGGGCTCCTTAGCACCCTGGT 676

Qy    249 AGCTGACCTGCAGCTGATAGACTTTGAGGGAAAAAAAGATGTGACCCAGATATTTAACAA 308
          ||||||  | ||||| || ||||||||||| |||||||| ||| | || || || |||||
Db    677 AGCTGATTTACAGCTCATTGACTTTGAGGGCAAAAAAGACGTGGCTCAAATTTTCAACAA 736

Qy    309 CATCTTGAGAAGACAGATAGGCACTCGGAGTCCTACTGTGGAGTATATTAGTGCTCATCC 368
           ||  | |||||||| || || ||  | | ||||||||| || || ||  |  | || |
Db    737 TATTCTCAGAAGACAAATTGGTACGAGAACTCCTACTGTTGAATACATCTGCACCCAACA 796

Qy    369 TCATATCCTGTTTATGCTCCTCAAAGGATATGAAGCCCCACAGATTGCCTTACGTTGTGG 428
            ||||  |||| ||| |  | ||||| |||||| | ||| | || ||  ||  ||||||
Db    797 GAATATTTTGTTCATGTTATTGAAAGGGTATGAATCTCCAGAAATAGCTCTAAATTGTGG 856

Qy    429 GATTATGCTGAGAGAATGTATTCGACATGAACCACTTGCCAAAATCATCCTCTTTTCTAA 488
           || ||| | |||||||| ||  |||||||||||||||| ||||||||  | |  ||  |
Db    857 AATAATGTTAAGAGAATGCATCAGACATGAACCACTTGCAAAAATCATTTTGTGGTCGGA 916

Qy    489 TCAATTCAGAGATTTCTTTAAGTACGTGGAGTTGTCAACATTTGATATTGCTTCAGATGC 548
           || ||    |||||||| |  || || ||  ||||||||||||| || |||||||||||
Db    917 ACAGTTTTATGATTTCTTCAGATATGTCGAAATGTCAACATTTGACATAGCTTCAGATGC 976

Qy    549 CTTTGCTACTTTCAAGGATTTACTAACCAGACATAAAGTGTTGGTAGCAGACTTCTTAGA 608
           ||||| || |||||||||||||| || ||||||||| || |    ||||| || || ||
Db    977 ATTTGCCACATTCAAGGATTTACTTACAAGACATAAATTGCTCAGTGCAGAATTTTTGGA 1036

Qy    609 ACAAAATTACGACACTATTTTTGAAGACTATGAGAAATTGCTTCAGTCTGAGAATTATGT 668
          |||  |||| || |   ||||    || |||||||| || ||||| || || ||||||||
Db   1037 ACAGCATTATGATAGATTTTTCAGTGAATATGAGAAGTTACTTCATTCAGAAAATTATGT 1096

Qy    669 TACTAAGAGACAGTCTTTAAAGCTGCTAGGGGAGCTGATCCTGGACCGTCACAACTTTGC 728
           || || ||||||||  | ||||| || || || ||  | || ||  | ||||||||  |
Db   1097 GACAAAAAGACAGTCACTGAAGCTTCTCGGTGAACTACTACTAGATAGACACAACTTCAC 1156

Qy    729 CATCATGACAAAGTATATCAGCAAGCCGGAGAACCTGAAACTCATGATGAACCTCCTTCG 788
           || |||||||| || ||||| || || |||||||| ||| | ||||||||||| || ||
Db   1157 AATTATGACAAAATACATCAGTAAACCTGAGAACCTCAAATTAATGATGAACCTGCTGCG 1216

Qy    789 GGATAAAAGTCCCAACATCCAGTTTGAAGCCTTTCATGTTTTTAAGGTGTTTGTGGCCAG 848
           || ||||||| ||||||||||||||| |||||||| ||||||||||||||||| ||||
Db   1217 AGACAAAAGTCGCAACATCCAGTTTGAGGCCTTTCACGTTTTTAAGGTGTTTGTAGCCAA 1276

Qy    849 TCCTCACAAAACACAGCCTATTGTGGAGATCCTGTTAAAAAATCAGCCCAAACTCATTGA 908
          |||| |||| || ||||| ||  | || |||||  | || || ||| |||||||||| ||
Db   1277 TCCTAACAAGACGCAGCCCATCCTAGACATCCTCCTCAAGAACCAGGCCAAACTCATAGA 1336

Qy    909 GTTTCTGAGCAGCTTCCAAAAAGAAAGGACGGATGATGAGCAGTTCGCTGACGAGAAGAA 968
          ||| || ||||  || || || || |||||||| |||||||||||    ||||||||||
Db   1337 GTTCCTCAGCAAGTTTCAGAACGACAGGACGGAGGATGAGCAGTTTAACGACGAGAAGAC 1396

Qy    969 CTACTTGATTAAACAGATCCGAGACTTGAAGAAAACGGCCC 1009
          ||| ||  ||||||||||| | || ||||||| | | || |
Db   1397 CTATTTAGTTAAACAGATCAGGGATTTGAAGAGACCAGCTC 1437



RESULT 12 
AAF30688 
ID AAF30688 standard; cDNA; 1053 BP.
XX
AC AAF30688;
XX
DT 11-JUN-2001 (first entry)
XX
DE Human acute neuronal induced calcium binding protein ANIC-BP-1B cDNA.
XX
KW Acute neuronal induced calcium binding protein; ANIC-BP-1B;
KW splice variant; human; stroke; head trauma; Parkinson's disease;
KW Alzheimer's disease; multiple sclerosis; spinal cord injury;
KW cerebroprotective; antiparkinsonian; nootropic; neuroprotective;
KW therapy; diagnosis; vaccine; ss.
XX
OS Homo sapiens.
XX
FH Key Location/Qualifiers
FT CDS 1..1053
FT /*tag= a
FT /product= "Human ANIC-BP-1B"
XX
PN WO200125423-A1.
XX

PD 12-APR-2001. 
XX 

PF 28-SEP-2000; 2000WO-EP09475.
XX

PR 04-OCT-1999; 99EP-0119113.
XX

PA (MERE) MERCK PATENT GMBH.
XX 

PI Duecker K, Den Daas I; 
XX 

DR WPI; 2001-266306/27. 

DR P-PSDB; AAB20387. 
XX 

PT Novel human acute neuronal induced calcium-binding protein like protein 

PT splice variant, useful for treating stroke, acute head trauma, 

PT Parkinson's disease, Alzheimer's disease multiple sclerosis, spinal 

PT cord injury - 

XX 

PS Claim 4; Page 43-44; 49pp; English. 
XX 

CC The present sequence is that of cDNA encoding a novel human acute 

CC neuronal induced calcium binding protein-like protein splice 

CC variant, ANIC-NP-1B (see AAB20387). The protein shows homology to

CC other members of the calcium binding protein family, including 

CC ANIC-BP, a protein discovered by mRNA differential display that is 

CC upregulated in a rat model of head trauma. ANIC-BP and ANIC-BP-1B

CC differ in their C-terminal portions. The variant protein could 

CC serve as a novel drug target. The invention provides ANIC-BP-1B 

CC polynucleotides and polypeptides, expression vectors, host cells 

CC and antibodies, as well as methods for producing the protein and 

CC for treating or preventing disorders associated with expression of 

CC the protein by inhibiting or activating the action of ANIC-BP-1B. 

CC Diseases that may be treated include stroke and acute head trauma, 

CC Parkinson's disease, Alzheimer's disease, multiple sclerosis and 

CC spinal cord injury. The polynucleotides and polypeptides can also 

CC be used in diagnostic assays and in vaccines, and to identify 

CC agonists and antagonists useful for treating conditions associated 

CC with ANIC-BP-1B imbalance. 
XX 

SQ Sequence 1053 BP; 357 A; 211 C; 214 G; 271 T; 0 other; 

Query Match 53.4%; Score 541.6; DB 22; Length 1053; 
Best Local Similarity 74.1%; Pred. No. 1.5e-140; 

Matches 716; Conservative 0; Mismatches 239; Indels 11; Gaps 2;

Qy     18 GTTTAGTAAATCACACAAAAATCCAGCAGAAATTGTGAAAATCCTGAAAGACAATTTGGC 77

Qy     78 CATTTTGGAAAAGCAAGAC---------AAAAAGACAGACAAGGCTTCAGAAGAAGTGTC 128
            || ||||||||||||||         |||||  |||| |||||| |||||||||| ||
Db     72 TGTTCTGGAAAAGCAAGACATTTCTGATAAAAAAGCAGAAAAGGCTACAGAAGAAGTTTC 131

Qy    129 TAAATCACTGCAAGCAATGAAAGAAATTCTGTGTGGTACAAACGAGAAAGAACCCCCAAC 188
Db    132 CAAAAATCTGGTTGCCATGAAAGAAATTCTGTATGGCACAAATGAAAAA... 191

Qy    189 AGAAGCAGTGGCTCAGCTAGCACAAGAACTCTACAGCAGTGGCCTGCTAGTGACACTGAT 248
          ||||||||| ||||| || || ||||||||||| |  ||||| || ||    || ||| |
Db    192 AGAAGCAGTAGCTCAACTTGCTCAAGAACTCTATAATAGTGGGCTCCTTAGCACCCTGGT 251

Qy    249 AGCTGACCTGCAGCTGATAGACTTTGAGGGAAAAAAAGATGTGACCCAGATATTTAACAA 308
          ||||||  | ||||| || ||||||||||| |||||||| ||| | || || || |||||
Db    252 AGCTGATTTACAGCTCATTGACTTTGAGGGCAAAAAAGACGTGGCTCAAATTTTCAACAA 311

Qy    309 CATCTTGAGAAGACAGATAGGCACTCGGAGTCCTACTGTGGAGTATATTAGTGCTCATCC 368
Db    312

Qy    369 TCATATCCTGTTTATGCTCCTCAAAGGATATGAAGCCCCACAGATTGCCTTACGTTGTGG 428
            ||||  |||| ||| |  | ||||| |||||| | ||| | || ||  ||  ||||||
Db    372 GAATATTTTGTTCATGTTATTGAAAGGGTATGAATCTCCAGAAATAGCTCTAAATTGTGG 431

Qy    429 GATTATGCTGAGAGAATGTATTCGACATGAACCACTTGCCAAAATCATCCTCTTTTCTAA 488
           || ||| | |||||||| ||  |||||||||||||||| ||||||||  | |  ||  |
Db    432 AATAATGTTAAGAGAATGCATCAGACATGAACCACTTGCAAAAATCATTTTGTGGTCGGA 491

Qy    489 TCAATTCAGAGATTTCTTTAAGTACGTGGAGTTGTCAACATTTGATATTGCTTCAGATGC 548
           || ||    |||||||| |  || || ||  ||||||||||||| || |||||||||||
Db    492 ACAGTTTTATGATTTCTTCAGATATGTCGAAATGTCAACATTTGACATAGCTTCAGATGC 551

Qy    549 CTTTGCTACTTTCAAGGATTTACTAACCAGACATAAAGTGTTGGTAGCAGACTTCTTAGA 608
           ||||| || |||||||||||||| || ||||||||| || |    ||||| || || ||
Db    552 ATTTGCCACATTCAAGGATTTACTTACAAGACATAAATTGCTCAGTGCAGAATTTTTGGA 611

Qy    609 ACAAAATTACGACACTATTTTTGAAGACTATGAGAAATTGCTTCAGTCTGAGAATTATGT 668
          |||  |||| || |   ||||    || |||||||| || ||||| || || ||||||||
Db    612 ACAGCATTATGATAGATTTTTCAGTGAATATGAGAAGTTACTTCATTCAGAAAATTATGT 671

Qy    669 TACTAAGAGACAGTCTTTAAAGCTGCTAGGGGAGCTGATCCTGGACCGTCACAACTTTGC 728
           || || ||||||||  | ||||| || || || ||  | || ||  | ||||||||  |
Db    672 GACAAAAAGACAGTCACTGAAGCTTCTCGGTGAACTACTACTAGATAGACACAACTTCAC 731

Qy    729 CATCATGACAAAGTATATCAGCAAGCCGGAGAACCTGAAACTCATGATGAACCTCCTTCG 788
           || |||||||| || ||||| || || |||||||| ||| | ||||||||||| || ||
Db    732 AATTATGACAAAATACATCAGTAAACCTGAGAACCTCAAATTAATGATGAACCTGCTGCG 791

Qy    789 GGATAAAAGTCCCAACATCCAGTTTGAAGCCTTTCATGTTTTTAAGGTGTTTGTGGCCAG 848
           || ||||||| ||||||||||||||| |||||||| ||||||||||||||||| ||||
Db    792 AGACAAAAGTCGCAACATCCAGTTTGAGGCCTTTCACGTTTTTAAGGTGTTTGTAGCCAA 851

Qy    849 TCCTCACAAAACACAGCCTATTGTGGAGATCCTGTTAAAAAATCAGCCCAAACTCATTGA 908
          |||| |||| || ||||| ||  | || |||||  | || || ||| |||||||||| ||
Db    852 TCCTAACAAGACGCAGCCCATCCTAGACATCCTCCTCAAGAACCAGGCCAAACTCATAGA 911

Qy    909 GTTTCTGAGCAGCTTCCAAAAAGAAAGGACGGAT--GATGAGCAGTTCGCTGACGAGAAG 966
Db    912

Qy    967
Db    972



RESULT 13 
AAS89557 

ID AAS89557 standard; cDNA; 1162 BP. 
XX 

AC AAS89557; 
XX 

DT 13-FEB-2002 (first entry) 
XX 

DE DNA encoding novel human diagnostic protein #25361. 
XX 

KW Human; chromosome mapping; gene mapping; gene therapy; forensic; 

KW food supplement; medical imaging; diagnostic; genetic disorder; ss. 
XX 

OS Homo sapiens.
XX

PN WO200175067-A2.
XX

PD 11-OCT-2001.
XX

PF 30-MAR-2001; 2001WO-US08631.
XX

PR 31-MAR-2000; 2000US-0540217.

PR 23-AUG-2000; 2000US-0649167.
XX 

PA (HYSE-) HYSEQ INC. 
XX 

PI Drmanac RT, Liu C, Tang YT; 
XX 

DR WPI; 2001-639362/73. 

DR P-PSDB; ABG25370. 
XX 

PT New isolated polynucleotide and encoded polypeptides, useful in 

PT diagnostics, forensics, gene mapping, identification of mutations 

PT responsible for genetic disorders or other traits and to assess 

PT biodiversity 
XX 

PS Claim 1; SEQ ID No 25361; 103pp; English.
XX 

CC The invention relates to isolated polynucleotide (I) and 

CC polypeptide (II) sequences. (I) is useful as hybridisation probes, 

CC polymerase chain reaction (PCR) primers, oligomers, and for chromosome 

CC and gene mapping, and in recombinant production of (II). The 

CC polynucleotides are also used in diagnostics as expressed sequence tags 

CC for identifying expressed genes. (I) is useful in gene therapy techniques 

CC to restore normal activity of (II) or to treat disease states involving 

CC (II). (II) is useful for generating antibodies against it, detecting or

CC quantitating a polypeptide in tissue, as molecular weight markers and as

CC a food supplement. (II) and its binding partners are useful in medical 

CC imaging of sites expressing (II). (I) and (II) are useful for treating 

CC disorders involving aberrant protein expression or biological activity. 

CC The polypeptide and polynucleotide sequences have applications in 

CC diagnostics, forensics, gene mapping, identification of mutations 

CC responsible for genetic disorders or other traits to assess biodiversity 

CC and to produce other types of data and products dependent on DNA and 

CC amino acid sequences. AAS64197-AAS94564 represent novel human

CC diagnostic coding sequences of the invention. 

CC Note: The sequence data for this patent did not appear in the printed 



CC specification, but was obtained in electronic format directly from WIPO 

CC at ftp.wipo.int/pub/published_pct_sequences.

XX 

SQ Sequence 1162 BP; 383 A; 241 C; 258 G; 280 T; 0 other; 



Query Match 53.2%; Score 539.6; DB 23; Length 1162;

Best Local Similarity 73.9%; Pred. No. 5.6e-140; 

Matches 743; Conservative 0; Mismatches 249; Indels 14; Gaps 4; 

Qy     18 GTTTAGTAAATCACACAAAAATCCAGCAGAAATTGTGAAAATCCTGAAAGACAATTTGGC 77
          |||| | || || ||||||  ||||||||| |||||||| |  ||||| || |   ||||
Db    143 GTTTGGGAAGTCTCACAAATCTCCAGCAGACATTGTGAAGAATCTGAAGGAGAGCATGGC 202

Qy     78 CATTTTGGAAAAGCAAGAC---------AAAAAGACAGACAAGGCTTCAGAAGAAGTGTC 128
            || ||||||||||||||         |||||  |||| |||||| |||||||||| ||
Db    203 TGTTCTGGAAAAGCAAGACATTTCTGATAAAAAAGCAGAAAAGGCTACAGAAGAAGTTTC 262

Qy    129 TAAATCACTGCAAGCAATGAAAGAAATTCTGTGTGGTACAAACGAGAAAGAACCCCCAAC 188
Db    263 CAAAAATCTGGTTGCCATGAAAGAAATTCTGTATGGCACAAATGAAAAAGATCCTCAGA... 322

Qy    189 AGAAGCAGTGGCTCAGCTAGCACAAGAACTCTACAGCAGTGGCCTGCTAGTGACACTGAT 248
          ||||||||  ||||| || || ||||||||||| |  ||||| || ||  | || ||| |
Db    323 AGAAGCAGGAGCTCAACTTGCTCAAGAACTCTATAATAGTGGGCTCCTTATCACCCTGGT 382

Qy    249 AGCTGACCTGCAGCTGATAGACTTTGAGGGAAAAAAAGATGTGACCCAGATATTTAACAA 308
Db    383 AGCTGATTTACAGCTCATTGACTTTGAGGGCAAAAAAGAC... 442

Qy    309 CATCTTGAGAAGACAGATAGGCA-CTCGGAGTCCTACTGTGGAGTATATTAGTGCTCATC 367
Db    443 TATTCTCAGAAGACAAATTGGTACCGAGAACTCCTACTGTTGAATA... 502

Qy    368 CTCA--TATCCTGTTTATGCTCCTCAAAGGATATGAAGCCCCACAGATT--GCCTTACGT 423
          |  |    |  |||| ||| |  | ||||| |||||| | ||    | |    || |  |
Db    503 CAGAATATTTTTGTTCATGTTATTGAAAGGGTATGAATCTCCCAGAAATAGCTCTAAATT 562

Qy    424 TGTGGGATTATGCTGAGAGAATGTATTCGACATGAACCACTTGCCAAAATCATCCTCTTT 483
Db    563 TGTGGAATAATGTTAAGAGAATGCATCAGACATGAACCACT... 622

Qy    484 TCTAATCAATTCAGAGATTTCTTTAAGTACGTGGAGTTGTCAACATTTGATATTGCTTCA 543
          ||  | || ||    |||||||| |  || || ||  ||||||||||||| || ||||||
Db    623 TCGGAACAGTTTTATGATTTCTTCAGATATGTCGAAATGTCAACATTTGACATAGCTTCA 682

Qy    544 GATGCCTTTGCTACTTTCAAGGATTTACTAACCAGACATAAAGTGTTGGTAGCAGACTTC 603
Db    683 GATGCATTTGCCACATTCAAGGATTTACTTAC... 742

Qy    604 TTAGAACAAAATTACGACACTATTTTTGAAGACTATGAGAAATTGCTTCAGTCTGAGAAT 663
          || |||||  |||| || |   ||||    || |||||||| || ||||| || || |||
Db    743 TTGGAACAGCATTATGATAGATTTTTCAGTGAATATGAGAAGTTACTTCATTCAGAAAAT 802

Qy    664 TATGTTACTAAGAGACAGTCTTTAAAGCTGCTAGGGGAGCTGATCCTGGACCGTCACAAC 723
          ||||| || || ||||||||  | ||||| || || || ||  | || ||  | ||||||
Db    803 TATGTGACAAAAAGACAGTCACTGAAGCTTCTCGGTGAACTACTACTAGATAGACACAAC 862

Qy    724 TTTGCCATCATGACAAAGTATATCAGCAAGCCGGAGAACCTGAAACTCATGATGAACCTC 783
Db    863 TTCACAATTATGACAAAATACATCAGTAAACCTGAGAACCTCAAATTAATGATG... 922

Qy    784 CTTCGGGATAAAAGTCCCAACATCCAGTTTGAAGCCTTTCATGTTTTTAAGGTGTTTGTG 843
          || || || ||||||| ||||||||||||||| |||||||| |||||||||||||||||
Db    923 CTGCGAGACAAAAGTCGCAACATCCAGTTTGAGGCCTTTCACGTTTTTAAGGTGTTTGTA 982

Qy    844 GCCAGTCCTCACAAAACACAGCCTATTGTGGAGATCCTGTTAAAAAATCAGCCCAAACTC 903
Db    983 GCCAATCCTAACAAGACGCAGCCCATC... 1042

Qy    904 ATTGAGTTTCTGAGCAGCTTCCAAAAAGAAAGGACGGATGATGAGCAGTTCGCTGACGAG 963
          || ||||| || ||||  || || || || |||||||| |||||||||||    ||||||
Db   1043 ATAGAGTTCCTCAGCAAGTTTCAGAACGACAGGACGGAGGATGAGCAGTTTAACGACGAG 1102

Qy    964 AAGAACTACTTGATTAAACAGATCCGAGACTTGAAGAAAACGGCCC 1009
          |||| ||| ||  ||||||||||| | || ||||||| | | || |
Db   1103 AAGACCTATTTAGTTAAACAGATCAGGGATTTGAAGAGACCAGCTC 1148

RESULT 14 
AAX39818/C 

ID AAX39818 standard; DNA; 833 BP. 
XX 

AC AAX39818; 
XX 

DT 02-JUL-1999 (first entry) 
XX 

DE Gastric cancer associated gene. 
XX 

KW Cancer associated antigen; diagnosis; research; treatment; human; 

KW breast cancer; colon cancer; gastric cancer; renal cancer; lung cancer; 

KW prostate cancer; ss. 

XX 

OS Homo sapiens. 
XX 

PN WO9904265-A2.
XX 
PD 28-JAN-1999.
XX

PF 15-JUL-1998; - iW< i-US14( ' >.
XX

PR 22-JUN-1998; 98US-0102322.
PR 17-JUL-1997; 97US-0896164.
PR 10-OCT-1997; 97US-0061599.
PR 10-OCT-1997; 97US-0061765.
PR 10-OCT-1997; 97US-0948705.
PR 11-OCT-1997; 97GB-0021697.
XX

PA (LUDW-) LUDWIG INST CANCER RES.
XX

PI Chen Y, Gout I, Gure A, O'Hare M, Obata Y, Old LJ;
PI Pfreundschuh M, Sahin U, Scanlan MJ, Stockert E;
PI Tureci O;
XX

DR WPI; 1999-132448/11.



PT New isolated cancer associated nucleic acids and polypeptides - 

PT isolated using sera from cancer patients, used to develop products 

PT for the diagnosis, monitoring or treatment of cancers 
XX 

PS Claim 67; Page 559; 787pp; English.
XX 

CC The invention relates to a method for diagnosing a disorder characterised 

CC by expression of a human cancer associated antigen precursor coded for by 

CC a nucleic acid molecule (NAM). The method comprises: (a) contacting a 

CC biological sample isolated from a subject with an agent that specifically 

CC binds to the NAM, an expression product or a fragment of an expression 

CC product complexed with an HLA molecule; and (b) determining the 

CC interaction between the agent and the NAM or the expression product as a 

CC determination of the disorder. The products and methods can be used in 

CC the diagnosis, monitoring, research, or treatment of conditions 

CC characterised by the expression of various cancer associated antigens. 

CC The invention provides nucleic acid sequences and encoded polypeptides 

CC which are cancer associated antigen precursors expressed in human breast 

CC cancer, renal cancer, colon cancer, gastric cancer, prostate cancer and 

CC lung cancer. 

XX 

SQ Sequence 833 BP; 253 A; 171 C; 172 G; 227 T; 10 other; 

Query Match 51.3%; Score 520.2; DB 20; Length 833; 

Best Local Similarity 98.1%; Pred. No. 1.2e-134; 

Matches 566; Conservative 0; Mismatches 7; Indels 4; Gaps 4;

Qy    442 GAATGTATTCGACATGAACCACTTG-CCAAAATCATCCTC-TTTTCTAATCAATTCAGAG 499
          |||| |||||||| ||| ||| ||| ||||| |||||||| |||||||||||||||||||
Db    732 GAATNTATTCGACTTGACCCANTTGCCCAAANTCATCCTCTTTTTCTAATCAATTCAGAG 673

Qy    500 ATTTCTTTAAGT-ACGTGGAGTTGTCAACATTTGATATTGCTTCAGATGCCTTTGCTACT 558
          |||||||||||| |||||||||||||||||||||||||||||||||||||||||||||||
Db    672 ATTTCTTTAAGTAACGTGGAGTTGTCAACATTTGATATTGCTTCAGATGCCTTTGCTACT 613

Qy    559 TTCAAGGATTTACTAACCAGA-CATAAAGTGTTGGTAGCAGACTTCTTAGAACAAAATTA 617
          |||||||||||||||||| || | ||||||||||||||||||||||||||||||||||||
Db    612 TTCAAGGATTTACTAACCNGACCTTAAAGTGTTGGTAGCAGACTTCTTAGAACAAAATTA 553

Qy    618 CGACACTATTTTTGAAGACTATGAGAAATTGCTTCAGTCTGAGAATTATGTTACTAAGAG 677
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    552 CGACACTATTTTTGAAGACTATGAGAAATTGCTTCAGTCTGAGAATTATGTTACTAAGAG 493

Qy    678 ACAGTCTTTAAAGCTGCTAGGGGAGCTGATCCTGGACCGTCACAACTTTGCCATCATGAC 737
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    492 ACAGTCTTTAAAGCTGCTAGGGGAGCTGATCCTGGACCGTCACAACTTTGCCATCATGAC 433

Qy    738 AAAGTATATCAGCAAGCCGGAGAACCTGAAACTCATGATGAACCTCCTTCGGGATAAAAG 797
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    432 AAAGTATATCAGCAAGCCGGAGAACCTGAAACTCATGATGAACCTCCTTCGGGATAAAAG 373

Qy    798 TCCCAACATCCAGTTTGAAGCCTTTCATGTTTTTAAGGTGTTTGTGGCCAGTCCTCACAA 857
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    372 TCCCAACATCCAGTTTGAAGCCTTTCATGTTTTTAAGGTGTTTGTGGCCAGTCCTCACAA 313

Qy    858 AACACAGCCTATTGTGGAGATCCTGTTAAAAAATCAGCCCAAACTCATTGAGTTTCTGAG 917

Qy    918 CAGCTTCCAAAAAGAAAGGACGGATGATGAGCAGTTCGCTGACGAGAAGAACTACTTGAT 977

Qy    978 TAAACAGATCCGAGACTTGAAGAAAACGGCCCCTTGA 1014


RESULT 15 
AAS88031 

ID AAS88031 standard; cDNA; 2492 BP. 
XX 

AC AAS88031; 
XX 

DT 13-FEB-2002 (first entry) 
XX 

DE DNA encoding novel human diagnostic protein #23835. 
XX 

KW Human; chromosome mapping; gene mapping; gene therapy; forensic; 

KW food supplement; medical imaging; diagnostic; genetic disorder; ss. 
XX 

OS Homo sapiens. 
XX 

PN WO200175067-A2.
XX

PD 11-OCT-2001.
XX

PF 30-MAR-2001; 2001WO-US08631.
XX

PR 31-MAR-2000; 2000US-0540217.

PR 23-AUG-2000; 2000US-0649167.
XX 

PA (HYSE-) HYSEQ INC. 
XX 

PI Drmanac RT, Liu C, Tang YT; 
XX 

DR WPI; 2001-639362/73. 

DR P-PSDB; ABG23844. 
XX 

PT New isolated polynucleotide and encoded polypeptides, useful in 

PT diagnostics, forensics, gene mapping, identification of mutations 

PT responsible for genetic disorders or other traits and to assess 

PT biodiversity 
XX 

PS Claim 1; SEQ ID No 23835; 103pp; English.
XX 

CC The invention relates to isolated polynucleotide (I) and 

CC polypeptide (II) sequences. (I) is useful as hybridisation probes, 

CC polymerase chain reaction (PCR) primers, oligomers, and for chromosome 

CC and gene mapping, and in recombinant production of (II). The

CC polynucleotides are also used in diagnostics as expressed sequence tags 

CC for identifying expressed genes. (I) is useful in gene therapy techniques 

CC to restore normal activity of (II) or to treat disease states involving 



CC (II). (II) is useful for generating antibodies against it, detecting or 

CC quantitating a polypeptide in tissue, as molecular weight markers and as 

CC a food supplement. (II) and its binding partners are useful in medical 

CC imaging of sites expressing (II). (I) and (II) are useful for treating

CC disorders involving aberrant protein expression or biological activity. 

CC The polypeptide and polynucleotide sequences have applications in 

CC diagnostics, forensics, gene mapping, identification of mutations 

CC responsible for genetic disorders or other traits to assess biodiversity 

CC and to produce other types of data and products dependent on DNA and 

CC amino acid sequences. AAS64197-AAS94564 represent novel human

CC diagnostic coding sequences of the invention.

CC Note: The sequence data for this patent did not appear in the printed

CC specification, but was obtained in electronic format directly from WIPO

CC at ftp.wipo.int/pub/published_pct_sequences.
XX 

SQ Sequence 2492 BP; 751 A; 477 C; 546 G; 718 T; 0 other; 



Query Match 48.9%; Score 496; DB 23; Length 2492; 

Best Local Similarity 73.0%; Pred. No. 1.1e-127;

Matches 737; Conservative 0; Mismatches 255; Indels 18; Gaps 7; 



Qy     18 GTTTAGTAAATCACACAAAAATCCAGCAGAAATTGTGAAAATCCTGAAAGACAATTTGGC 77
Db    143 GTTTGGGAAGTCTCACAAATCTCCAGCAGACATTGTG... 202

Qy     78 CATTTTGGAAAAGCAAGAC---------AAAAAGACAGACAAGGCTTCAGAAGAAGTGTC 128
            || ||||||||||||||         |||||  |||| |||||| |||||||||| ||
Db    203 TGTTCTGGAAAAGCAAGACATTTCTGATAAAAAAGCAGAAAAGGCTACAGAAGAAGTTTC 262

Qy    129 TAAATCACTGCAAGCAATGAAAGAAATTCTGTGTGGTACAAACGAGAAAGAACCCCCAAC 188
Db    263 CAAAAATCTGGTTGCCATGAAAGAAATTCTGTATGGCACAAATGAAAAAGATCCTCAGA... 322

Qy    189 AGAAGCAGTGGCTCAGCTAGCACAAGAACTCTACAGCAGTGGCCTGCTAGTGACACTGAT 248
          ||||||||  ||||| || || ||||||||||| |  ||||| || ||  | || ||| |
Db    323 AGAAGCAGGAGCTCAACTTGCTCAAGAACTCTATAATAGTGGGCTCCTTATCACCCTGGT 382

Qy    249 AGCTGACCTGCAGCTGATAGACTTTGAGGGAAAAAAAGATGTGACCCAGATATTTAACAA 308
Db    383 AGCTGATTTACAGCTCATTGACTTTGAGGGCAAAAAAGACGTGGCT... 442

Qy    309 CATCTTGAGAAGACAGATAGGCA-CTCGGAGTCCTACTGTGGAGTATATTAGTGCTCATC 367
           ||  | |||||||| || || | |  | | ||||||||| || || ||  |  | ||
Db    443 TATTCTCAGAAGACAAATTGGTACCGAGAACTCCTACTGTTGAATACATCTGCACCCAAA 502

Qy    368 CTCA--TATCCTGTTTATGCTCCTCAAAGGATATGAAGCCCCACAGATT--GCCTTACGT 423
          |  |    |  |||| ||| |  | ||||| |||||| | ||    | |    || |  |
Db    503 CAGAATATTTTTGTTCATGTTATTGAAAGGGTATGAATCTCCCAGAAATAGCTCTAAATT 562

Qy    424 TGTGGGATTATGCTGAGAGAATGTATTCGACATGAACCACTTGCCAAAATCATCCTCTTT 483
Db    563 TGTGGAATAATGTTAAGAGAATGCATCAGACATGA... 622

Qy    484 TCTAATCAATTCAGAGATTTCTTTAAGTACGTGGAGTTGTCAACATTTGATATTGCTTCA 543
          ||  | || ||    |||||||| |  || || ||  ||||||||||||| || ||||||
Db    623 TCGGAACAGTTTTATGATTTCTTCAGATATGTCGAAATGTCAACATTTGACATAGCTTCA 682

Qy    544 GATGCC-TTTGCTACTTTCAAGGATTTACTAACCAGACATAAAGTGTTGGTAGCAGACTT 602
Db    683 GATGCCATTTGCCACATTCA... 742

Qy    603 CTTAGAACAAAATTACGACACTATTTTTGAAGACTATGAGAAATTGCTTCAGTCTGAGAA 662
           || |||||  |||| || |   ||||    || |||||||| || ||||| || || ||
Db    743 TTTGGAACAGCATTATGATAGATTTTTCAGTGAATATGAGAAGTTACTTCATTCAGAAAA 802

Qy    663 TTATGTTACTAAGAGACAGTCTTTAAAGCTGCTAGGGGAGCTGATCCTGGACCGTCACAA 722
          |||||| || || ||||||||  | ||||| || || || ||  | || ||  | |||||
Db    803 TTATGTGACAAAAAGACAGTCACTGAAGCTTCTCGGTGAACTACTACTAGATAGACACAA 862

Qy    723 CTTTGCCATCATGACAAAGTATATCAGCAAGCCGGAG--AACCTGAAACTCATGATGAAC 780
          |||  | || |||||||| || ||||| || || | |  | |   ||  | |||||||||
Db    863 CTTCACAATTATGACAAAATACATCAGTAAACCTGTGGAACCTCAAATTTAATGATGAAC 922

Qy    781 CTCCTTCGGGATAAAAGTCCCAACATCCAGTTTGAAGCCTTTCATGTTTTTAAGG-TGTT 839
          || || || || ||||||| ||||||||||||||| |||||||| ||||||||||  |||
Db    923 CTGCTGCGAGACAAAAGTCGCAACATCCAGTTTGAGGCCTTTCACGTTTTTAAGGCAGTT 982

Qy    840 TGTGGCCAGTCCTCACAAAACACAGCCTATTGTGGAGATCCTGTTAAAAAATCAGCCCAA 899
          ||| |||| |||| |||| || ||||| ||  | || |||||  | || || ||| ||||
Db    983 TGTAGCCAATCCTAACAAGACGCAGCCCATCCTAGACATCCTCCTCAAGAACCAGGCCAA 1042

Qy    900 ACTCATTGAGTTTCTGAGCAGCTTCCAAAAAGAAAGGACGGATGATGAGCAGTTCGCTGA 959
Db   1043 ACTCATAGAGTTCCTCAGCAAGTTTCAGAACGACAGGACGGAGGATGAGCAGT... 1102

Qy    960 CGAGAAGAACTACTTGATTAAACAGATCCGAGACTTGAAGAAAACGGCCC 1009
Db   1103



Search completed: January 6, 2004, 01:28:41 
Job time : 394 secs



GenCore version 5.1.6 
Copyright (c) 1993 - 2004 Compugen Ltd. 



OM nucleic - nucleic search, using sw model 



Run on:         January 6, 2004, 02:35:04 ; Search time 1394 Seconds
                (without alignments)
                2517.743 Million cell updates/sec

Title:          US-10-088-872-1
Perfect score:  1014
Sequence:       1 atgaaaaaaatgcctttgtt..........tgaagaaaacggccccttga 1014

Scoring table:  IDENTITY_NUC
                Gapop 10.0 , Gapext 1.0



Searched:       2263443 seqs, 1730637950 residues



Total number of hits satisfying chosen parameters: 4526886 

Minimum DB seq length: 0 

Maximum DB seq length: 2000000000 

Post-processing: Minimum Match 0% 

Maximum Match 100% 
Listing first 45 summaries 
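
The parameters above describe a Smith-Waterman ("sw model") local alignment with affine gap penalties (Gapop 10.0, Gapext 1.0). The following is a minimal sketch of that technique, not the GenCore implementation. The match and mismatch values (+1 and -0.6) are assumptions inferred from the scores listed later in this report, and the convention that a gap of length k costs Gapop + k * Gapext is also an assumption.

```python
def smith_waterman_affine(query: str, target: str,
                          match: float = 1.0, mismatch: float = -0.6,
                          gap_open: float = 10.0, gap_extend: float = 1.0) -> float:
    """Best local alignment score using Gotoh's affine-gap recurrences.

    A gap of length k costs gap_open + k * gap_extend (assumed convention).
    Only the score is returned; no traceback is kept.
    """
    query, target = query.upper(), target.upper()
    neg_inf = float("-inf")
    n = len(target)
    h_prev = [0.0] * (n + 1)      # best score of an alignment ending at (i-1, j)
    f_prev = [neg_inf] * (n + 1)  # same, but ending with a gap in the target
    best = 0.0
    for i in range(1, len(query) + 1):
        h_cur = [0.0] * (n + 1)
        f_cur = [neg_inf] * (n + 1)
        e = neg_inf               # alignment ending with a gap in the query
        for j in range(1, n + 1):
            e = max(e - gap_extend, h_cur[j - 1] - gap_open - gap_extend)
            f_cur[j] = max(f_prev[j] - gap_extend,
                           h_prev[j] - gap_open - gap_extend)
            sub = match if query[i - 1] == target[j - 1] else mismatch
            # Local alignment: a cell never drops below zero.
            h_cur[j] = max(0.0, h_prev[j - 1] + sub, e, f_cur[j])
            best = max(best, h_cur[j])
        h_prev, f_prev = h_cur, f_cur
    return best


if __name__ == "__main__":
    print(smith_waterman_affine("acgtacgtac", "ACGTACGTAC"))
```

Under these assumptions, a query aligned to an identical copy of itself scores its own length, which is consistent with the "Perfect score" of 1014 reported for the 1014-base query.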



Database :      Published_Applications_NA:*
                1:  /cgn2_6/ptodata/1/pubpna/US07_PUBCOMB.seq:*
                2:  /cgn2_6/ptodata/1/pubpna/PCT_NEW_PUB.seq:*
                3:  /cgn2_6/ptodata/1/pubpna/US06_NEW_PUB.seq:*
                4:  /cgn2_6/ptodata/1/pubpna/US06_PUBCOMB.seq:*
                5:  /cgn2_6/ptodata/1/pubpna/US07_NEW_PUB.seq:*
                6:  /cgn2_6/ptodata/1/pubpna/PCTUS_PUBCOMB.seq:*
                7:  /cgn2_6/ptodata/1/pubpna/US08_NEW_PUB.seq:*
                8:  /cgn2_6/ptodata/1/pubpna/US08_PUBCOMB.seq:*
                9:  /cgn2_6/ptodata/1/pubpna/US09A_PUBCOMB.seq:*
                10:  /cgn2_6/ptodata/1/pubpna/US09B_PUBCOMB.seq:*
                11:  /cgn2_6/ptodata/1/pubpna/US09C_PUBCOMB.seq:*
                12:  /cgn2_6/ptodata/1/pubpna/US09_NEW_PUB.seq:*
                13:  /cgn2_6/ptodata/1/pubpna/US09_NEW_PUB.seq2:*
                14:  /cgn2_6/ptodata/1/pubpna/US10A_PUBCOMB.seq:*
                15:  /cgn2_6/ptodata/1/pubpna/US10B_PUBCOMB.seq:*
                16:  /cgn2_6/ptodata/1/pubpna/US10_NEW_PUB.seq:*
                17:  /cgn2_6/ptodata/1/pubpna/US60_NEW_PUB.seq:*
                18:  /cgn2_6/ptodata/1/pubpna/US60_PUBCOMB.seq:*

Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 



                                  SUMMARIES
                    %
Result            Query
   No.   Score    Match   Length  DB  ID                      Description
--------------------------------------------------------------------------------
    1    1014    100.0     1421  13  US-10-117-722-111       Sequence 111, App
    2    1014    100.0     1421  15  US-10-037-270-111       Sequence 111, App
    3  1010.8     99.7     1344  15  US-10-025-730-2         Sequence 2, Appli
    4     398     39.3      475  11  US-09-918-995-5343      Sequence 5343, Ap
    5   288.8     28.5      690   9  US-09-910-943-318       Sequence 318, App
    6   246.4     24.3      435  10  US-09-867-701-5263      Sequence 5263, Ap
    7   244.8     24.1      447  10  US-09-867-701-5899      Sequence 5899, Ap
    8   244.8     24.1      450  10  US-09-867-701-4953      Sequence 4953, Ap
    9   210.8     20.8      762   9  US-09-910-941-15        ocL^LlcrllL- c JO, i-ipp±
c  10     195     19.2      387  10  US-09-954-456-14S3      ocyuciiLc 11 j j ,
c  11     195     19.2      387  10  US-09-880-107-481       Sequence 481, App
   12   169.8     16.7      722  13  US-10-257-826A-118      Sequence 118, App
   13   166.6     16.4      700  13  US-10-9R7-R96A-119      oequciice ±±y f App
   14     156     15.4      861   9  US-09-770-44R-R99       OCtJUcHC Dy&f App
   15    74.2      7.3      262   9  TTCI-09-993-A7£-19R1    oequence izoi, Ap
   16    74.2      7.3      262  12  US-09-993-R76-19SI      ofcrLjUtrllC c /\p
c  17    65.6      6.5      336  11  US-09-918-99S-19069     Cprntpnr'D 1 QACQ A
   18      65      6.4      487  12  US-10-949-1SS-191       oequence j5Z j , App
   19      65      6.4      487  13  US-10-0RO-9S4-S4        OtrLJUcllCcr D^i , App±
   20    53.6      5.3      254  10  US-09-R7R-S74-13369
c  21    50.6      5.0      486  11  US-09-770-961-777       OCL|Ut:Ilv-.C III, -H-PP
   22    41.6      4.1      242   9  US-09-991-R76-9S9R      ocqucHLc ZDzo , i-\.p
   23    41.6      4.1      242  12  US-09-993-R76-9S9R      otrL^UtrilL-c ZDZo , Ap
   24    40.8      4.0     1295  12  US-10-310-1S4-994       Qprmpnpp O Q y| Ann
   25    40.2      4.0   113306  12  US-10-999-79ft-1007     OtrvJU.trXlL,fc; 1UU / ,
   26    39.8      3.9      431  11  US-09-91R-99S-S7R7      Ot:L| U.trli.U ti D / o / t Ap
c  27    38.4      3.8     6301  13  US-10-311-455-26        Sequence 26, Appl
   28    38.2      3.8     1200  13  US-10-027-632-96193S
   29    38.2      3.8     1200  14  US-10-097-^39-9^193R
   30    37.8      3.7     1457  15  US-10-0^4-968-9         ocLjUtrlluc: i-ipp± x
   31    37.8      3.7     7178  13  US-09-873-367r-97ft     OCl|UCllLC Zi / O 7 J-ipp
   32    37.6      3.7     1267  14  tjc_i 0-001 - M43-4S    Dcl4UcIlLc ^-3/ Appj.
c  33    37.6      3.7  367377R  13  US-10-319-R41-1         sequence x, Appii
   34    37.4      3.7     2232  15  US-10-0ft7-4^4-4R       ocquciiLc D , /\pp±
c  35    37.4      3.7     4012   9  US-09-876-8R9-33S       oc^ucnL t; JjD, /\pp
c  36    37.4      3.7     4103  13  US-10-117-799-390       ocq U. ell L, tr U , /ijjjj
c  37    37.4      3.7     4103  15  US-10-037-970-390       ■DcC^UcIlL-fcr j;U, App
c  38    37.4      3.7     8577  13  US-10-311-455-1760      Sequence 1760, Ap
c  39    37.2      3.7      869  13  US-10-027-632-261978    Sequence 261978,
c  40    37.2      3.7      869  14  US-10-027-632-261978    Sequence 261978,
c  41      37      3.6     5413  13  US-10-311-455-538       Sequence 538, App
c  42    36.6      3.6     9367  13  US-10-311-455-944       Sequence 944, App
   43    36.4      3.6      461  14  US-10-079-623-143       Sequence 143, App
   44    36.4      3.6     2641  12  US-10-369-493-29299     Sequence 29299, A
c  45    36.4      3.6     6071  13  US-10-311-455-297       Sequence 297, App
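
The "% Query Match" column appears to be the hit score expressed as a percentage of the perfect score (1014 for this query), rounded to one decimal. That relationship is inferred from the listed values rather than stated in the report, so the sketch below is a consistency check under that assumption, and the function name is illustrative only.

```python
PERFECT_SCORE = 1014  # "Perfect score" from the run header


def query_match_percent(score: float, perfect_score: float = PERFECT_SCORE) -> float:
    """Score as a percentage of the perfect score (assumed definition)."""
    return 100.0 * score / perfect_score


if __name__ == "__main__":
    # Scores taken from the summary rows and alignment headers of this report.
    for score in (1014, 1010.8, 398, 288.8, 246.4, 244.8):
        print(score, round(query_match_percent(score), 1))
```

A check of this kind is useful when repairing a scanned table: a row whose score and percentage disagree has most likely been misread in one of the two columns.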



ALIGNMENTS 



RESULT 1 

US-10-117-722-111 

; Sequence 111, Application US/10117722
; Publication No. US20030219744A1
; GENERAL INFORMATION:
;  APPLICANT: Tang, Y. Tom
;  APPLICANT: Liu, Chenghua
;  APPLICANT: Asundi, Vinod
;  APPLICANT: Zhang, Jie
;  APPLICANT: Drmanac, Radoje T.
;  TITLE OF INVENTION: Novel Nucleic Acids and
;  TITLE OF INVENTION: Polypeptides
;  FILE REFERENCE: 784CIP2BCIP
;  CURRENT APPLICATION NUMBER: US/10/117,722
;  CURRENT FILING DATE: 2002-04-04
;  PRIOR APPLICATION NUMBER: 09/620,312
;  PRIOR FILING DATE: 2000-07-19
;  PRIOR APPLICATION NUMBER: 09/552,317
;  PRIOR FILING DATE: 2000-04-25
;  PRIOR APPLICATION NUMBER: 09/488,725
;  PRIOR FILING DATE: 2000-01-21
;  NUMBER OF SEQ ID NOS: 1104
;  SOFTWARE: pt_FL_genes Version 1.0
; SEQ ID NO 111
;   LENGTH: 1421
;   TYPE: DNA
;   ORGANISM: Homo sapiens
;   FEATURE:
;   NAME/KEY: CDS
;   LOCATION: (217)..(1230)
US-10-117-722-111 

Query Match 100.0%; Score 1014; DB 13; Length 1421; 

Best Local Similarity 100.0%; Pred. No. 1.2e-281; 

Matches 1014; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy     1 ATGAAAAAAATGCCTTTGTTTAGTAAATCACACAAAAATCCAGCAGAAATTGTGAAAATC 60
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   217 ATGAAAAAAATGCCTTTGTTTAGTAAATCACACAAAAATCCAGCAGAAATTGTGAAAATC 276

Qy    61 CTGAAAGACAATTTGGCCATTTTGGAAAAGCAAGACAAAAAGACAGACAAGGCTTCAGAA 120
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   277 CTGAAAGACAATTTGGCCATTTTGGAAAAGCAAGACAAAAAGACAGACAAGGCTTCAGAA 336

Qy   121 GAAGTGTCTAAATCACTGCAAGCAATGAAAGAAATTCTGTGTGGTACAAACGAGAAAGAA 180
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   337 GAAGTGTCTAAATCACTGCAAGCAATGAAAGAAATTCTGTGTGGTACAAACGAGAAAGAA 396

Qy   181 CCCCCAACAGAAGCAGTGGCTCAGCTAGCACAAGAACTCTACAGCAGTGGCCTGCTAGTG 240
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   397 CCCCCAACAGAAGCAGTGGCTCAGCTAGCACAAGAACTCTACAGCAGTGGCCTGCTAGTG 456

Qy   241 ACACTGATAGCTGACCTGCAGCTGATAGACTTTGAGGGAAAAAAAGATGTGACCCAGATA 300
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   457 ACACTGATAGCTGACCTGCAGCTGATAGACTTTGAGGGAAAAAAAGATGTGACCCAGATA 516

Qy   301 TTTAACAACATCTTGAGAAGACAGATAGGCACTCGGAGTCCTACTGTGGAGTATATTAGT 360
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   517 TTTAACAACATCTTGAGAAGACAGATAGGCACTCGGAGTCCTACTGTGGAGTATATTAGT 576

Qy   361 GCTCATCCTCATATCCTGTTTATGCTCCTCAAAGGATATGAAGCCCCACAGATTGCCTTA 420
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   577 GCTCATCCTCATATCCTGTTTATGCTCCTCAAAGGATATGAAGCCCCACAGATTGCCTTA 636

Qy   421 CGTTGTGGGATTATGCTGAGAGAATGTATTCGACATGAACCACTTGCCAAAATCATCCTC 480
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   637 CGTTGTGGGATTATGCTGAGAGAATGTATTCGACATGAACCACTTGCCAAAATCATCCTC 696

Qy   481 TTTTCTAATCAATTCAGAGATTTCTTTAAGTACGTGGAGTTGTCAACATTTGATATTGCT 540
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   697 TTTTCTAATCAATTCAGAGATTTCTTTAAGTACGTGGAGTTGTCAACATTTGATATTGCT 756

Qy   541 TCAGATGCCTTTGCTACTTTCAAGGATTTACTAACCAGACATAAAGTGTTGGTAGCAGAC 600
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   757 TCAGATGCCTTTGCTACTTTCAAGGATTTACTAACCAGACATAAAGTGTTGGTAGCAGAC 816

Qy   601 TTCTTAGAACAAAATTACGACACTATTTTTGAAGACTATGAGAAATTGCTTCAGTCTGAG 660
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   817 TTCTTAGAACAAAATTACGACACTATTTTTGAAGACTATGAGAAATTGCTTCAGTCTGAG 876

Qy   661 AATTATGTTACTAAGAGACAGTCTTTAAAGCTGCTAGGGGAGCTGATCCTGGACCGTCAC 720
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   877 AATTATGTTACTAAGAGACAGTCTTTAAAGCTGCTAGGGGAGCTGATCCTGGACCGTCAC 936

Qy   721 AACTTTGCCATCATGACAAAGTATATCAGCAAGCCGGAGAACCTGAAACTCATGATGAAC 780
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   937 AACTTTGCCATCATGACAAAGTATATCAGCAAGCCGGAGAACCTGAAACTCATGATGAAC 996

Qy   781 CTCCTTCGGGATAAAAGTCCCAACATCCAGTTTGAAGCCTTTCATGTTTTTAAGGTGTTT 840
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   997 CTCCTTCGGGATAAAAGTCCCAACATCCAGTTTGAAGCCTTTCATGTTTTTAAGGTGTTT 1056

Qy   841 GTGGCCAGTCCTCACAAAACACAGCCTATTGTGGAGATCCTGTTAAAAAATCAGCCCAAA 900
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1057 GTGGCCAGTCCTCACAAAACACAGCCTATTGTGGAGATCCTGTTAAAAAATCAGCCCAAA 1116

Qy   901 CTCATTGAGTTTCTGAGCAGCTTCCAAAAAGAAAGGACGGATGATGAGCAGTTCGCTGAC 960
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1117 CTCATTGAGTTTCTGAGCAGCTTCCAAAAAGAAAGGACGGATGATGAGCAGTTCGCTGAC 1176

Qy   961 GAGAAGAACTACTTGATTAAACAGATCCGAGACTTGAAGAAAACGGCCCCTTGA 1014
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1177 GAGAAGAACTACTTGATTAAACAGATCCGAGACTTGAAGAAAACGGCCCCTTGA 1230
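
Each alignment row carries a label (Qy or Db), a start coordinate, the aligned residues and an end coordinate, so a row can be checked for internal consistency: the end minus the start plus one should equal the number of residues shown, with "-" counted as a gap rather than a residue. The helper names below are hypothetical, and treating embedded whitespace as scanning noise is an assumption that suits a scanned copy like this one.

```python
def parse_alignment_row(line: str):
    """Split 'Db 1177 GAGA...TTGA 1230' into (label, start, sequence, end).

    Whitespace inside the sequence is treated as scanning noise and removed.
    """
    tokens = line.split()
    if 4 > len(tokens) or tokens[0] not in ("Qy", "Db"):
        raise ValueError(f"not an alignment row: {line!r}")
    label, start, end = tokens[0], int(tokens[1]), int(tokens[-1])
    sequence = "".join(tokens[2:-1])
    return label, start, sequence, end


def row_is_consistent(line: str) -> bool:
    """True when end - start + 1 equals the residue count ('-' marks a gap)."""
    _, start, sequence, end = parse_alignment_row(line)
    residues = sum(1 for ch in sequence if ch != "-")
    return end - start + 1 == residues


if __name__ == "__main__":
    row = "Db 1177 GAGAAGAACTACTTGATTAAACAGATCCGAGACTTGAAGAAAACGGCCCCTTGA 1230"
    print(parse_alignment_row(row)[1:], row_is_consistent(row))
```

A row whose end coordinate was split by the scanner (for example "123 0") fails the check, which makes it easy to locate damaged rows before correcting them by hand.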



RESULT 2 

US-10-037-270-111 

Sequence 111, Application US/10037270
Publication No. US20030104529A1
GENERAL INFORMATION:
APPLICANT: Tang, Y. Tom
APPLICANT: Liu, Chenghua
APPLICANT: Asundi, Vinod
APPLICANT: Zhang, Jie
APPLICANT: Ren, Feiyan
APPLICANT: Chen, Rui-hong
APPLICANT: Zhao, Qing A.
APPLICANT: Wehrman, Tom
APPLICANT: Xue, Aidong J.
APPLICANT: Yang, Yonghong
APPLICANT: Wang, Jian-Rui
APPLICANT: Zhou, Ping
APPLICANT: Ma, Yunqing
APPLICANT: Wang, Dunrui
APPLICANT: Wang, Zhiwei
APPLICANT: Tillinghast, John
APPLICANT: Drmanac, Radoje T.
TITLE OF INVENTION: Novel Nucleic Acids and
TITLE OF INVENTION: Polypeptides
FILE REFERENCE: 784CIP2B
CURRENT APPLICATION NUMBER: US/10/037,270
CURRENT FILING DATE: 2002-01-04
PRIOR APPLICATION NUMBER: 09/552,317
PRIOR FILING DATE: 2000-04-25
PRIOR APPLICATION NUMBER: 09/488,725
PRIOR FILING DATE: 2000-01-21
NUMBER OF SEQ ID NOS: 1104
SOFTWARE: pt_FL_genes Version 1.0
SEQ ID NO 111
LENGTH: 1421
TYPE: DNA
ORGANISM: Homo sapiens
FEATURE:
NAME/KEY: CDS
LOCATION: (217)..(1230)
US-10-037-270-111 

Query Match 100.0%; Score 1014; DB 15; Length 1421; 

Best Local Similarity 100.0%; Pred. No. 1.2e-281; 

Matches 1014; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy     1 ATGAAAAAAATGCCTTTGTTTAGTAAATCACACAAAAATCCAGCAGAAATTGTGAAAATC 60
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   217 ATGAAAAAAATGCCTTTGTTTAGTAAATCACACAAAAATCCAGCAGAAATTGTGAAAATC 276

Qy    61 CTGAAAGACAATTTGGCCATTTTGGAAAAGCAAGACAAAAAGACAGACAAGGCTTCAGAA 120
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   277 CTGAAAGACAATTTGGCCATTTTGGAAAAGCAAGACAAAAAGACAGACAAGGCTTCAGAA 336

Qy   121 GAAGTGTCTAAATCACTGCAAGCAATGAAAGAAATTCTGTGTGGTACAAACGAGAAAGAA 180
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   337 GAAGTGTCTAAATCACTGCAAGCAATGAAAGAAATTCTGTGTGGTACAAACGAGAAAGAA 396

Qy   181 CCCCCAACAGAAGCAGTGGCTCAGCTAGCACAAGAACTCTACAGCAGTGGCCTGCTAGTG 240
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   397 CCCCCAACAGAAGCAGTGGCTCAGCTAGCACAAGAACTCTACAGCAGTGGCCTGCTAGTG 456

Qy   241 ACACTGATAGCTGACCTGCAGCTGATAGACTTTGAGGGAAAAAAAGATGTGACCCAGATA 300
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   457 ACACTGATAGCTGACCTGCAGCTGATAGACTTTGAGGGAAAAAAAGATGTGACCCAGATA 516

Qy   301 TTTAACAACATCTTGAGAAGACAGATAGGCACTCGGAGTCCTACTGTGGAGTATATTAGT 360
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   517 TTTAACAACATCTTGAGAAGACAGATAGGCACTCGGAGTCCTACTGTGGAGTATATTAGT 576

Qy   361 GCTCATCCTCATATCCTGTTTATGCTCCTCAAAGGATATGAAGCCCCACAGATTGCCTTA 420
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   577 GCTCATCCTCATATCCTGTTTATGCTCCTCAAAGGATATGAAGCCCCACAGATTGCCTTA 636

Qy   421 CGTTGTGGGATTATGCTGAGAGAATGTATTCGACATGAACCACTTGCCAAAATCATCCTC 480
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   637 CGTTGTGGGATTATGCTGAGAGAATGTATTCGACATGAACCACTTGCCAAAATCATCCTC 696

Qy   481 TTTTCTAATCAATTCAGAGATTTCTTTAAGTACGTGGAGTTGTCAACATTTGATATTGCT 540
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   697 TTTTCTAATCAATTCAGAGATTTCTTTAAGTACGTGGAGTTGTCAACATTTGATATTGCT 756

Qy   541 TCAGATGCCTTTGCTACTTTCAAGGATTTACTAACCAGACATAAAGTGTTGGTAGCAGAC 600
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   757 TCAGATGCCTTTGCTACTTTCAAGGATTTACTAACCAGACATAAAGTGTTGGTAGCAGAC 816

Qy   601 TTCTTAGAACAAAATTACGACACTATTTTTGAAGACTATGAGAAATTGCTTCAGTCTGAG 660
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   817 TTCTTAGAACAAAATTACGACACTATTTTTGAAGACTATGAGAAATTGCTTCAGTCTGAG 876

Qy   661 AATTATGTTACTAAGAGACAGTCTTTAAAGCTGCTAGGGGAGCTGATCCTGGACCGTCAC 720
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   877 AATTATGTTACTAAGAGACAGTCTTTAAAGCTGCTAGGGGAGCTGATCCTGGACCGTCAC 936

Qy   721 AACTTTGCCATCATGACAAAGTATATCAGCAAGCCGGAGAACCTGAAACTCATGATGAAC 780
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   937 AACTTTGCCATCATGACAAAGTATATCAGCAAGCCGGAGAACCTGAAACTCATGATGAAC 996

Qy   781 CTCCTTCGGGATAAAAGTCCCAACATCCAGTTTGAAGCCTTTCATGTTTTTAAGGTGTTT 840
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   997 CTCCTTCGGGATAAAAGTCCCAACATCCAGTTTGAAGCCTTTCATGTTTTTAAGGTGTTT 1056

Qy   841 GTGGCCAGTCCTCACAAAACACAGCCTATTGTGGAGATCCTGTTAAAAAATCAGCCCAAA 900
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1057 GTGGCCAGTCCTCACAAAACACAGCCTATTGTGGAGATCCTGTTAAAAAATCAGCCCAAA 1116

Qy   901 CTCATTGAGTTTCTGAGCAGCTTCCAAAAAGAAAGGACGGATGATGAGCAGTTCGCTGAC 960
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1117 CTCATTGAGTTTCTGAGCAGCTTCCAAAAAGAAAGGACGGATGATGAGCAGTTCGCTGAC 1176

Qy   961 GAGAAGAACTACTTGATTAAACAGATCCGAGACTTGAAGAAAACGGCCCCTTGA 1014
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1177 GAGAAGAACTACTTGATTAAACAGATCCGAGACTTGAAGAAAACGGCCCCTTGA 1230





RESULT 3 
US-10-025-730-2 

; Sequence 2, Application US/10025730
; Publication No. US20030045466A1
; GENERAL INFORMATION:
;  APPLICANT: Tang, Y. Tom
;  APPLICANT: Guegler, Karl J.
;  APPLICANT: Corley, Neil C.
;  APPLICANT: Gorgone, Gina A.
;  TITLE OF INVENTION: CALCIUM BINDING PROTEIN
;  FILE REFERENCE: PF-0635 US
;  CURRENT APPLICATION NUMBER: US/10/025,730
;  CURRENT FILING DATE: 2001-12-18
;  PRIOR APPLICATION NUMBER: US/09/190,965
;  PRIOR FILING DATE: 1998-11-13
;  NUMBER OF SEQ ID NOS: 5
;  SOFTWARE: PERL Program
; SEQ ID NO 2
;   LENGTH: 1344
;   TYPE: DNA
;   ORGANISM: Homo sapiens
;   FEATURE: -
;   OTHER INFORMATION: 3734805
US-10-025-730-2 

Query Match 99.7%; Score 1010.8; DB 15; Length 1344; 

Best Local Similarity 99.8%; Pred. No. 1e-280; 

Matches 1012; Conservative 0; Mismatches 2; Indels 0; Gaps 0; 
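
The figures in these result headers are mutually consistent with a simple scoring rule: each match contributes +1 and each mismatch -0.6, and "Best Local Similarity" is the number of matches divided by the number of aligned positions. Both rules are inferred from the numbers printed in this report (for example 1012 matches and 2 mismatches giving 1010.8) and are not taken from GenCore documentation; the constants and function names below are illustrative assumptions.

```python
MATCH_SCORE = 1.0      # assumption inferred from the reported scores
MISMATCH_SCORE = -0.6  # assumption inferred from the reported scores


def ungapped_score(matches: int, mismatches: int) -> float:
    """Score of an alignment without indels under the assumed scheme."""
    return matches * MATCH_SCORE + mismatches * MISMATCH_SCORE


def best_local_similarity(matches: int, mismatches: int, indels: int = 0) -> float:
    """Matches as a percentage of all aligned positions (assumed definition)."""
    return 100.0 * matches / (matches + mismatches + indels)


if __name__ == "__main__":
    # (matches, mismatches) pairs as printed for Results 3, 5, 6 and 7.
    for matches, mismatches in ((1012, 2), (338, 82), (298, 86), (297, 87)):
        print(round(ungapped_score(matches, mismatches), 1),
              round(best_local_similarity(matches, mismatches), 1))
```

Because every result in this section reports zero indels, the gap penalties do not enter these particular checks.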

Qy     1 ATGAAAAAAATGCCTTTGTTTAGTAAATCACACAAAAATCCAGCAGAAATTGTGAAAATC 60
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   124 ATGAAAAAAATGCCTTTGTTTAGTAAATCACACAAAAATCCAGCAGAAATTGTGAAAATC 183

Qy    61 CTGAAAGACAATTTGGCCATTTTGGAAAAGCAAGACAAAAAGACAGACAAGGCTTCAGAA 120
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   184 CTGAAAGACAATTTGGCCATTTTGGAAAAGCAAGACAAAAAGACAGACAAGGCTTCAGAA 243

Qy   121 GAAGTGTCTAAATCACTGCAAGCAATGAAAGAAATTCTGTGTGGTACAAACGAGAAAGAA 180
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   244 GAAGTGTCTAAATCACTGCAAGCAATGAAAGAAATTCTGTGTGGTACAAACGAGAAAGAA 303

Qy   181 CCCCCAACAGAAGCAGTGGCTCAGCTAGCACAAGAACTCTACAGCAGTGGCCTGCTAGTG 240
         ||||| |||||||||||||||||||||||||||||||||||||||||||||||||| |||
Db   304 CCCCCGACAGAAGCAGTGGCTCAGCTAGCACAAGAACTCTACAGCAGTGGCCTGCTGGTG 363

Qy   241 ACACTGATAGCTGACCTGCAGCTGATAGACTTTGAGGGAAAAAAAGATGTGACCCAGATA 300
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   364 ACACTGATAGCTGACCTGCAGCTGATAGACTTTGAGGGAAAAAAAGATGTGACCCAGATA 423

Qy   301 TTTAACAACATCTTGAGAAGACAGATAGGCACTCGGAGTCCTACTGTGGAGTATATTAGT 360
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   424 TTTAACAACATCTTGAGAAGACAGATAGGCACTCGGAGTCCTACTGTGGAGTATATTAGT 483

Qy   361 GCTCATCCTCATATCCTGTTTATGCTCCTCAAAGGATATGAAGCCCCACAGATTGCCTTA 420
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   484 GCTCATCCTCATATCCTGTTTATGCTCCTCAAAGGATATGAAGCCCCACAGATTGCCTTA 543

Qy   421 CGTTGTGGGATTATGCTGAGAGAATGTATTCGACATGAACCACTTGCCAAAATCATCCTC 480
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   544 CGTTGTGGGATTATGCTGAGAGAATGTATTCGACATGAACCACTTGCCAAAATCATCCTC 603

Qy   481 TTTTCTAATCAATTCAGAGATTTCTTTAAGTACGTGGAGTTGTCAACATTTGATATTGCT 540
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   604 TTTTCTAATCAATTCAGAGATTTCTTTAAGTACGTGGAGTTGTCAACATTTGATATTGCT 663

Qy   541 TCAGATGCCTTTGCTACTTTCAAGGATTTACTAACCAGACATAAAGTGTTGGTAGCAGAC 600
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   664 TCAGATGCCTTTGCTACTTTCAAGGATTTACTAACCAGACATAAAGTGTTGGTAGCAGAC 723

Qy   601 TTCTTAGAACAAAATTACGACACTATTTTTGAAGACTATGAGAAATTGCTTCAGTCTGAG 660
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   724 TTCTTAGAACAAAATTACGACACTATTTTTGAAGACTATGAGAAATTGCTTCAGTCTGAG 783

Qy   661 AATTATGTTACTAAGAGACAGTCTTTAAAGCTGCTAGGGGAGCTGATCCTGGACCGTCAC 720
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   784 AATTATGTTACTAAGAGACAGTCTTTAAAGCTGCTAGGGGAGCTGATCCTGGACCGTCAC 843

Qy   721 AACTTTGCCATCATGACAAAGTATATCAGCAAGCCGGAGAACCTGAAACTCATGATGAAC 780
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   844 AACTTTGCCATCATGACAAAGTATATCAGCAAGCCGGAGAACCTGAAACTCATGATGAAC 903

Qy   781 CTCCTTCGGGATAAAAGTCCCAACATCCAGTTTGAAGCCTTTCATGTTTTTAAGGTGTTT 840
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   904 CTCCTTCGGGATAAAAGTCCCAACATCCAGTTTGAAGCCTTTCATGTTTTTAAGGTGTTT 963

Qy   841 GTGGCCAGTCCTCACAAAACACAGCCTATTGTGGAGATCCTGTTAAAAAATCAGCCCAAA 900
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   964 GTGGCCAGTCCTCACAAAACACAGCCTATTGTGGAGATCCTGTTAAAAAATCAGCCCAAA 1023

Qy   901 CTCATTGAGTTTCTGAGCAGCTTCCAAAAAGAAAGGACGGATGATGAGCAGTTCGCTGAC 960
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1024 CTCATTGAGTTTCTGAGCAGCTTCCAAAAAGAAAGGACGGATGATGAGCAGTTCGCTGAC 1083

Qy   961 GAGAAGAACTACTTGATTAAACAGATCCGAGACTTGAAGAAAACGGCCCCTTGA 1014
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1084 GAGAAGAACTACTTGATTAAACAGATCCGAGACTTGAAGAAAACGGCCCCTTGA 1137

RESULT 4 

US-09-918-995-5343 

Sequence 5343, Application US/09918995 
Publication No. US20030073623A1 
GENERAL INFORMATION: 
APPLICANT: Hyseq, Inc. 

TITLE OF INVENTION: NOVEL NUCLEIC ACID SEQUENCES OBTAINED 
TITLE OF INVENTION: FROM VARIOUS cDNA LIBRARIES 
FILE REFERENCE: 20411-756 

CURRENT APPLICATION NUMBER: US/09/918,995 
CURRENT FILING DATE: 2001-07-30 
PRIOR APPLICATION NUMBER: US/09/235,076 
PRIOR FILING DATE: 1999-01-20 
NUMBER OF SEQ ID NOS : 38054 
SOFTWARE: FastSEQ for Windows Version 3.0 
SEQ ID NO 5343 
LENGTH: 475 
TYPE: DNA 

ORGANISM: Homo sapiens 
FEATURE : 

NAME/KEY: misc_feature 
LOCATION: (1)...(475) 
OTHER INFORMATION: n = A,T,C or G 
US-09-918-995-5343 



Query Match 39.3%; Score 398; DB 11; Length 475; 

Best Local Similarity 100.0%; Pred. No. 3.5e-104; 

Matches 398; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 



Qy   617 ACGACACTATTTTTGAAGACTATGAGAAATTGCTTCAGTCTGAGAATTATGTTACTAAGA 676
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     1 ACGACACTATTTTTGAAGACTATGAGAAATTGCTTCAGTCTGAGAATTATGTTACTAAGA 60

Qy   677 GACAGTCTTTAAAGCTGCTAGGGGAGCTGATCCTGGACCGTCACAACTTTGCCATCATGA 736
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    61 GACAGTCTTTAAAGCTGCTAGGGGAGCTGATCCTGGACCGTCACAACTTTGCCATCATGA 120

Qy   737 CAAAGTATATCAGCAAGCCGGAGAACCTGAAACTCATGATGAACCTCCTTCGGGATAAAA 796
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   121 CAAAGTATATCAGCAAGCCGGAGAACCTGAAACTCATGATGAACCTCCTTCGGGATAAAA 180

Qy   797 GTCCCAACATCCAGTTTGAAGCCTTTCATGTTTTTAAGGTGTTTGTGGCCAGTCCTCACA 856
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   181 GTCCCAACATCCAGTTTGAAGCCTTTCATGTTTTTAAGGTGTTTGTGGCCAGTCCTCACA 240

Qy   857 AAACACAGCCTATTGTGGAGATCCTGTTAAAAAATCAGCCCAAACTCATTGAGTTTCTGA 916
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   241 AAACACAGCCTATTGTGGAGATCCTGTTAAAAAATCAGCCCAAACTCATTGAGTTTCTGA 300

Qy   917 GCAGCTTCCAAAAAGAAAGGACGGATGATGAGCAGTTCGCTGACGAGAAGAACTACTTGA 976
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   301 GCAGCTTCCAAAAAGAAAGGACGGATGATGAGCAGTTCGCTGACGAGAAGAACTACTTGA 360

Qy   977 TTAAACAGATCCGAGACTTGAAGAAAACGGCCCCTTGA 1014
         ||||||||||||||||||||||||||||||||||||||
Db   361 TTAAACAGATCCGAGACTTGAAGAAAACGGCCCCTTGA 398

RESULT 5 

US-09-910-943-318 

Sequence 318, Application US/09910943 
Patent No. US20020081610A1 
GENERAL INFORMATION: 
APPLICANT: Hemmati-Brivanlou, Ali 
APPLICANT: Altman, Curtis 

TITLE OF INVENTION: Assays and Materials for Embryonic Gene Expression 
FILE REFERENCE: 7529/1G148US1 
CURRENT APPLICATION NUMBER: US/09/910,943 
CURRENT FILING DATE: 2001-07-23 
NUMBER OF SEQ ID NOS : 742 
SOFTWARE: PatentIn version 3.1 
SEQ ID NO 318 
LENGTH: 690 
TYPE: DNA 

ORGANISM: Xenopus laevis 
FEATURE : 

NAME/KEY: misc_feature 
LOCATION: (1)..(690) 

OTHER INFORMATION: n may be a or g or c or t/u 
US-09-910-943-318 

Query Match 28.5%; Score 288.8; DB 9; Length 690; 

Best Local Similarity 80.5%; Pred. No. 1.4e-72; 

Matches 338; Conservative 0; Mismatches 82; Indels 0; Gaps 0; 

Qy   595 GCAGACTTCTTAGAACAAAATTACGACACTATTTTTGAAGACTATGAGAAATTGCTTCAG 654
         Mill II 1 1 1 1 lllll || IN | MIIIMI II I Mill
Db    69 GCAGAATTTCTAGAGCAAAATTACGACAGAATATTTAATGACTATGAAAAGCTTCTTCAC 128

Qy   655 TCTGAGAATTATGTTACTAAGAGACAGTCTTTAAAGCTGCTAGGGGAGCTGATCCTGGAC 714
         IIIIIMI Mill i I lllllllllll I MINIM II MIIIIIIMIIIII
Db   129 TCTGAGAACTATGTGACGAAGAGACAGTCCCTTAAGCTGCTGGGCGAGCTGATCCTGGAC 188

Qy   715 CGTCACAACTTTGCCATCATGACAAAGTATATCAGCAAGCCGGAGAACCTGAAACTCATG 774
         II MMIIMI MM Mill II II II IIIIIMI II II Mill MINI
Db   189

Qy   775 ATGAACCTCCTTCGGGATAAAAGTCCCAACATCCAGTTTGAAGCCTTTCATGTTTTTAAG 834
         Mill II II II Mill II II Mill 1 1 M ! II 1 1 1 1 II Mill MUM
Db   249 ATGAATCTGCTCCGTGATAAGAGCCCAAACATTCAGTTTGAAGCATTCCATGTGTTTAAG 308

Qy   835 GTGTTTGTGGCCAGTCCTCACAAAACACAGCCTATTGTGGAGATCCTGTTAAAAAATCAG 894
         II II I Ml 1 1 1 1 1 M i 1 1 1 1 1 II Mill 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II
Db   309 GTGTTTGTAGCAAATCCAAACAAAACACAGCCCATCGTGGATATCCTGTTAAAAAACCAA 368

Qy   895 CCCAAACTCATTGAGTTTCTGAGCAGCTTCCAAAAAGAAAGGACGGATGATGAGCAGTTC 954
         MM I II II II lllllllllll II II M I II Mill II MINI
Db   369 .CCAAGTTAATCGACTTCCTGAGCAGCTTTCAGAAGGATCGAACAGATGACGAACAGTTC 428

Qy   955 GCTGACGAGAAGAACTACTTGATTAAACAGATCCGAGACTTGAAGAAAACGGCCCCTTGA 1014
         I 1 1 1 , 1 1 1 1 M I : i 1 1 1 1 1 i IIIIIMI IIIIIMI II II I I II III
Db   429



RESULT 6 

US-09-867-701-5263 

; Sequence 5263, Application US/09867701 

; Patent No. US20020132237A1 

; GENERAL INFORMATION: 

; APPLICANT: Aglate, Paul A. 

; APPLICANT: Jones, Robert 

; APPLICANT: Harlocker, Susan L. 

; TITLE OF INVENTION: COMPOSITIONS AND METHODS FOR THE THERAPY 
; TITLE OF INVENTION: AND DIAGNOSIS OF OVARIAN CANCER 
FILE REFERENCE: 210121.497 

CURRENT APPLICATION NUMBER: US/09/867,701 
CURRENT FILING DATE: 2001-05-29 
; NUMBER OF SEQ ID NOS : 10912 
; SOFTWARE: FastSEQ for Windows Version 4.0 
; SEQ ID NO 5263 
LENGTH: 435 
TYPE : DNA 

ORGANISM: Homo sapien 
US-09-867-701-5263 



Query Match 24.3%; Score 246.4; DB 10; Length 435; 

Best Local Similarity 77.6%; Pred. No. 1.8e-60; 

Matches 298; Conservative 0; Mismatches 86; Indels 0; Gaps 0; 

Qy   626 TTTTTGAAGACTATGAGAAATTGCTTCAGTCTGAGAATTATGTTACTAAGAGACAGTCTT 685
         I I I I M I I I f I I I I M Mill II II IIIIIMI M II IIIIIMI
Db    41 TTTTC^GTGAATATGAGAAGTTACTTCATTOVGAAAATTATGTGAOU\AAAGACAGTCA.C 100

Qy   686 TAAAGCTGCTAGGGGAGCTGATCCTGGACCGTCACAACTTTGCCATCATGACAAAGTATA 745
         I Mill M II II II I II II I llllllll I II IIIIIMI II I
Db   101 TGAAGCTTCTCGGTGAACTACTACTAGATAGACACAACTTCACAATTATGACAAAATACA 160

Qy   746 TCAGCAAGCCGGAGAACCTGAAACTCATGATGAACCTCCTTCGGGATAAAAGTCCCAACA 805
         INI II II IIIIIMI III 1 IIIIIIIIMI II II II 1 1 1 1 1 II MMI
Db   161 TCAGTAAACCTGAGAACCTCAAATTAATGATGAACCTGCTGCGAGACAAAAGTCGCAACA 220

Qy   806 TCCAGTTTGAAGCCTTTCATGTTTTTAAGGTGTTTGTGGCCAGTCCTCACAAAACACAGC 865
         MIIIIMM llllllll MMMMMMIMM MM MM MM II MM
Db   221 TCCAGTTTGAGGCCTTTCACGTTTTTAAGGTGTTTGTAGCCAATCCTAACAAGACGCAGC 280

Qy   866 CTATTGTGGAGATCCTGTTAAAAAATCAGCCCAAACTCATTGAGTTTCTGAGCAGCTTCC 925
         Ml III Mill 1 II II III Illlllllli Mill II MM II 1
Db   281 CCATCCTAGACATCCTCCTCAAGAACCAGGCCAAACTCATAGA 340

Qy   926 AAAAAGAAAGGACGGATGATGAGCAGTTCGCTGACGAGAAGAACTACTTGATTAAACAGA 985
         1 M II llllllll 11,11 IIIIIMI III M lllllllll
Db   341 AGAACGACAGGACGGAGGATGAGCAGTTTAACGACGAGAAGACCTATTTAGTTAAACAGA 400

Qy   986 TCCGAGACTTGAAGAAAACGGCCC 1009
         MM II IIIIIII Mill
Db   401 TCCGGGATTTGAAGAGACCCGCTC 424





RESULT 7 

US-09-867-701-5899 

; Sequence 5899, Application US/09867701 
; Patent No. US20020132237A1 
; GENERAL INFORMATION: 
; APPLICANT: Aglate, Paul A. 
; APPLICANT: Jones, Robert 
; APPLICANT: Harlocker, Susan L. 
; TITLE OF INVENTION: COMPOSITIONS AND METHODS FOR THE THERAPY 
; TITLE OF INVENTION: AND DIAGNOSIS OF OVARIAN CANCER 
FILE REFERENCE: 210121.497 

CURRENT APPLICATION NUMBER: US/09/867,701 
CURRENT FILING DATE: 2001-05-29 
; NUMBER OF SEQ ID NOS : 10912 
; SOFTWARE: FastSEQ for Windows Version 4.0 
; SEQ ID NO 5899 
LENGTH: 447 
TYPE : DNA 

ORGANISM: Homo sapien 
US-09-867-701-5899 

Query Match 24.1%; Score 244.8; DB 10; Length 447; 

Best Local Similarity 77.3%; Pred. No. 5.3e-60; 

Matches 297; Conservative 0; Mismatches 87; Indels 0; Gaps 0; 

Qy 626 TTTTTGAAGACTATGAGAAATTGCTTCAGTCTGAGAATTATGTTACTAAGAGACAGTCTT 685

MM II IIIIIMI II Mill II II llllllll M II llllllll

Db 41 TTTTCAGTGAATATGAGAAGTTACTTCATTCAGAAAA 100

Qy 686 TAAAGCTGCTAGGGGAGCTGATCCTGGACCGTCACAACTTTGCCATCATGACAAAGTATA 745

I Mill II II II II Ml II I IIIIIMI I II llllllll M I

Db 101 TGAAGCTTCTCGGTGAACTACTACTAGATAGACA(^ 160

Qy 746 TCAGCAAGCCGGAGAACCTGAAACTCATGATGAACCTCCTTCGGGATAAAAGTCCCAACA 805

1 1 1 1 1 1 M llllllll III I Illllllll II II || lllllll INN

Db 161 TCAGTAAACCTGAGAACCTCAAATTAATGATGAACCTGCTGCGAGACAAAAGTCGCAACA 220

Qy 806 TCCAGTTTGAAGCCTTTCATGTTTTTAAGGTGTTTGTGGCCAGTCCTCACAAAACACAGC 865

MIIIIIIM llllllll Illllllllllllllll MM MM MM II MM

Db 221 TCCAGTTTGAGGCCTTTCACGTTTTTAAGGTGTTTGTAGCCAATCCTAACAAGACGCAGC 280

Qy 866 CTATTGTGGAGATCCTGTTAAAAAATCAGCCCAAACTCATTGAGTTTCTGAGCAGCTTCC 925

I II Ml MMI I II II III MIIIIIIM Mill II MM II I

Db 281 CCATCCTAGACATCCTCCTCAAGAACCAGGCCAAACTCATAGAGTTCCTCAGCAAGTTTC 340

Qy 926 AAAAAGAAAGGACGGATGATGAGCAGTTCGCTGACGAGAAGAACTACTTGATTAAACAGA 985

I M M MMMM MMMMMI MIIIIIIM III II Illllllll

Db 341 AGAACGACAGGACGGAGGATGAGCAGTTTAACGACGAGAAGACCTATTTAGTTAAACAGA 400

Qy 986 TCCGAGACTTGAAGAAAACGGCCC 1009

II I II lllllll I I II I

Db 401 TCAGGGATTTGAAGAGACCAGCTC 424
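
The statistics lines of each result appear to be related by simple arithmetic: "Best Local Similarity" matches the fraction of aligned columns that are identical. The sketch below is a reader's consistency check, not part of the GenCore output; the function name is hypothetical, and the formula is an assumption inferred from the reported counts.

```python
def best_local_similarity(matches: int, mismatches: int, indels: int) -> float:
    """Percentage of aligned columns that are identical (assumed definition)."""
    columns = matches + mismatches + indels  # every aligned column is one of these
    return 100.0 * matches / columns

# Counts copied from the "Matches ...; Mismatches ...; Indels ..." lines.
reported = {
    "RESULT 7": (297, 87, 0),   # report states 77.3%
    "RESULT 9": (287, 75, 3),   # report states 78.6%
}
for name, counts in reported.items():
    print(name, round(best_local_similarity(*counts), 1))
```

Under this assumption the computed values round to the percentages printed in the report, which is a quick way to catch digits damaged by OCR.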

RESULT 8 

US-09-867-701-4953 

; Sequence 4953, Application US/09867701
; Patent No. US20020132237A1
; GENERAL INFORMATION:
; APPLICANT: Aglate, Paul A.
; APPLICANT: Jones, Robert
; APPLICANT: Harlocker, Susan L.
; TITLE OF INVENTION: COMPOSITIONS AND METHODS FOR THE THERAPY
; TITLE OF INVENTION: AND DIAGNOSIS OF OVARIAN CANCER
; FILE REFERENCE: 210121.497
; CURRENT APPLICATION NUMBER: US/09/867,701
; CURRENT FILING DATE: 2001-05-29
; NUMBER OF SEQ ID NOS: 10912
; SOFTWARE: FastSEQ for Windows Version 4.0
; SEQ ID NO 4953
; LENGTH: 450
; TYPE: DNA
; ORGANISM: Homo sapien
US-09-867-701-4953 

Query Match 24.1%; Score 244.8; DB 10; Length 450; 

Best Local Similarity 77.3%; Pred. No. 5.3e-60; 

Matches 297; Conservative 0; Mismatches 87; Indels 0; Gaps 0; 

Qy 626 TTTTTGAAGACTATGAGAAATTGCTTCAGTCTGAGAATTATGTTACTAAGAGACAGTCTT 685

MM M llllllll II Mill II II MMMM M II MMMM

Db 27 TTTTCAGTGAATATGAGAAGTTACTTCATTCAGAAAATTATGTGACAAAAAGACAG 86

Qy 686 TAAAGCTGCTAGGGGAGCTGATCCTGGACCGTCACAACTTTGCCATCATGACAAAGTATA 745

I Mill II M M II I II II I MMMM I II MMMM M I

Db 87 TGAAGCTTCTCGGTGAACTACTACTAGATAGACACAACTTC^CAATTATGACAAAATACA 146

Qy 746 TCAGCAAGCCGGAGAACCTGAAACTCATGATGAACCTCCTTCGGGATAAAAGTCCCAACA 805

1 1 1 1 II II 1 1 f 1 1 1 1 1 III I lllllllllll II II II MMMI Mill

Db 147 TCAGTAAACCTGAGAACCTCAAATTAATGATGAACCTGCTGCGAGACAAAAGTCGCAACA 206

Qy 806 TCCAGTTTGAAGCCTTTCATGTTTTTAAGGTGTTTGTGGCCAGTCCTCACAAAACACAGC 865

Illlllllil llllllll lllllllllllllllll MM MM MM II MM

Db 207 TCCAGTTTGAGGCCTTTCACGTTTTTAAGGTGTTTGTAGCCAATCCTAACAAGACGCAGC 266

Qy 866 CTATTGTGGAGATCCTGTTAAAAAATCAGCCCAAACTCATTGAGTTTCTGAGCAGCTTCC 925

Ml I M Mill I II II III Illlllllil lllll II I I I I II I

Db 267 CCATCCTAGACATCCTCCTCAAGAACC^ 326

Qy 926 AAAAAGAAAGGACGGATGATGAGCAGTTCGCTGACGAGAAGAACTACTTGATTAAACAGA 985

I I I M MIMIII lllllllllll Illlllllil III II lllllllll

Db 327 AGAACGACAGGACGGAGGATGAGCAGTTTAACGACGAGAAGACCTATTTAGTTAAACAGA 386

Qy 986 TCCGAGACTTGAAGAAAACGGCCC 1009

II I II lllllll lllll

Db 387 TCAGGGATTTGAAGAGACCAGCTC 410
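
The reported scores can be reproduced from the match and mismatch counts under an assumed scoring scheme. The report names the scoring table (IDENTITY_NUC) and the gap parameters (Gapop 10.0, Gapext 1.0) but does not print the table itself. The +1.0 match and -0.6 mismatch values below are assumptions inferred from the reported figures, and the affine gap formula is likewise assumed. The function name is hypothetical.

```python
def alignment_score(matches, mismatches, gap_lengths=(),
                    match=1.0, mismatch=-0.6,      # assumed IDENTITY_NUC values
                    gap_open=10.0, gap_ext=1.0):   # Gapop / Gapext from the report header
    """Score of a local alignment under an assumed affine-gap scheme."""
    # Each gap is assumed to cost gap_open plus gap_ext per gapped position.
    gap_cost = sum(gap_open + gap_ext * length for length in gap_lengths)
    return matches * match + mismatches * mismatch - gap_cost

# RESULT 7 and RESULT 8: 297 matches, 87 mismatches, no gaps; report states 244.8.
print(round(alignment_score(297, 87), 1))
```

With these assumed values the gap-free results in this report are reproduced to one decimal place. Results whose database sequence contains ambiguity codes (n) may score differently, which this sketch does not model.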



RESULT 9 

US-09-910-943-35 

; Sequence 35, Application US/09910943
; Patent No. US20020081610A1
; GENERAL INFORMATION:
; APPLICANT: Hemmati-Brivanlou, Ali
; APPLICANT: Altman, Curtis
; TITLE OF INVENTION: Assays and Materials for Embryonic Gene Expression
; FILE REFERENCE: 7529/1G148US1
; CURRENT APPLICATION NUMBER: US/09/910,943
; CURRENT FILING DATE: 2001-07-23
; NUMBER OF SEQ ID NOS: 742
; SOFTWARE: PatentIn version 3.1
; SEQ ID NO 35
; LENGTH: 762
; TYPE: DNA
; ORGANISM: Xenopus laevis
; FEATURE:
; NAME/KEY: misc_feature
; LOCATION: (1)..(762)
; OTHER INFORMATION: n may be a or g or c or t/u
US-09-910-943-35 



Query Match 20.8%; Score 210.8; DB 9; Length 762; 

Best Local Similarity 78.6%; Pred. No. 4.8e-50; 

Matches 287; Conservative 0; Mismatches 75; Indels 3; Gaps 3; 

Qy 1 ATGAAAAAAATGCCTTTGTTTAGTAAATCACACAAAAATCCAGC 60

Mill llllllll lllll II 1 1 lllll llllllll II II lllll 1 1 1 1

Db 397 ATGAAGAAAATGCCATTGTTCAGCAAGT(^CATAAAAATCCGGCTGAGATTGTTAAAACT 456

Qy 61 CTGAAAGACAATTTGGCCATTTTGGAAAAGCAAGACAAAAAGACAGACAAGGCTTCAGAA 120

HIM HIM Mill I llllll Ml llllllll M II Mill II III

Db 457 CTGAAGGACAACATGGCCCTGCTGGAAAGGCAGGACAAAAAAACTGAAAAGGCCTCTGAA 516

Qy 121 GAAGTGTCTAAATCACTGCAAGCAATGAAAGAAATTCTGTGTGGTACAAACGAGAAAGAA 180

M 1 1 M II II 1 1 II II lllll I Mill Ml lllllll Ml II llllll

Db 517 GAAGTGTCTAAATCTCTTCAAGCTACAAAAGAGATTTTGTGTGGGACAGGGGACAAAGAA 576

Qy 181 CCCCCAACAGAAGCAGTGGCTCAGCTAGCACAAGAACTCTACAGCAGTGGCCTGCTAGTG 240

II I Mill I 1 1 E 1 1 1 1 1 1 1 ( lllllllllll MM 1 1 1 1 1 1 1 M I II

Db 577 CCTCAGACAGAGACGGTGGCT(^GCTCGC^(^GAACTGTACAACAGTGGCTTGTTGGTT 636

Qy 241 ACACTGATAGCTGACC-TGCAGCTGATAGACTTTGAGGGAAAAAAAGATGTGACCCAGAT 299

II I Mill III 1 1 1 1 II Mill Mill II II II M I M I I Mill

Db 637 ACTTTAATAGCCCACCTTGCATCTCIATAGATTTTGANGGCAAGAAAGATGTATCTCAGAT 696

Qy 300 ATTTAACAACATCTTGAGAAGACAGATAGGCACTCGGAGTCCTACTGTGGAGTATATTAG 359

Ml II Mill MUM llllll 1 1 1 J 1 1 1 1 1 1 [ I I 1 1 M 1 1 1 1 1 1 1 1 ] I

Db 697 ATTCNAC-ACATCCTGAGAAAACAGATTGGCACTCGGAGTNC-CCTGTGGAGTATATCAA 754

Qy 360

I I

Db 755



RESULT 10 

US-09-954-456-1453/C

; Sequence 1453, Application US/09954456
; Patent No. US20020115057A1
; GENERAL INFORMATION:
; APPLICANT: Young, Paul
; TITLE OF INVENTION: Process for Identifying Anti-Cancer Therapeutic Agents Using Cancer Gene
; TITLE OF INVENTION: Sets
; FILE REFERENCE: 689290-76
; CURRENT APPLICATION NUMBER: US/09/954,456
; CURRENT FILING DATE: 2001-09-18
; PRIOR APPLICATION NUMBER: US/60/233,617
; PRIOR FILING DATE: 2000-09-18
; PRIOR APPLICATION NUMBER: US/60/234,052
; PRIOR FILING DATE: 2000-09-20
; PRIOR APPLICATION NUMBER: US/60/234,923
; PRIOR FILING DATE: 2000-09-25
; PRIOR APPLICATION NUMBER: US/60/235,134
; PRIOR FILING DATE: 2000-09-25
; PRIOR APPLICATION NUMBER: US/60/235,637
; PRIOR FILING DATE: 2000-09-26
; PRIOR APPLICATION NUMBER: US/60/235,638
; PRIOR FILING DATE: 2000-09-26
; PRIOR APPLICATION NUMBER: US/60/235,711
; PRIOR FILING DATE: 2000-09-27
; PRIOR APPLICATION NUMBER: US/60/235,720
; PRIOR FILING DATE: 2000-09-27
; PRIOR APPLICATION NUMBER: US/60/235,840
; PRIOR FILING DATE: 2000-09-27
; PRIOR APPLICATION NUMBER: US/60/235,863
; PRIOR FILING DATE: 2000-09-27
; NUMBER OF SEQ ID NOS: 2276
; SOFTWARE: PatentIn version 3.0
; SEQ ID NO 1453
; LENGTH: 387
; TYPE: DNA
; ORGANISM: Homo sapiens
US-09-954-456-1453 



Query Match 19.2%; Score 195; DB 10; Length 387;

Best Local Similarity 100.0%; Pred. No. 1.1e-45;

Matches 195; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy 820 TTTCATGTTTTTAAGGTGTTTGTGGCCAGTCCTCACAAAACACAGCCTATTGTGGAGATC 879

       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

Db 387 TTTCATGTTTTTAAGGTGTTTGTGGCCAGTCCTCACAAAACACAGCCTATTGTGGAGATC 328

Qy 880 CTGTTAAAAAATCAGCCCAAACTCATTGAGTTTCTGAGCAGCTTCCAAAAAGAAAGGACG 939

       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

Db 327 CTGTTAAAAAATCAGCCCAAACTCATTGAGTTTCTGAGCAGCTTCCAAAAAGAAAGGACG 268

Qy 940 GATGATGAGCAGTTCGCTGACGAGAAGAACTACTTGATTAAACAGATCCGAGACTTGAAG 999

       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

Db 267 GATGATGAGCAGTTCGCTGACGAGAAGAACTACTTGATTAAACAGATCCGAGACTTGAAG 208

Qy 1000 AAAACGGCCCCTTGA 1014

        |||||||||||||||

Db  207 AAAACGGCCCCTTGA 193
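
In this result the Db coordinates run downward (387 to 193) and the identifier carries a "/C" suffix, which suggests the hit lies on the complementary strand of the database entry. The report does not say so explicitly, so treat that reading as an assumption. Under it, the stored database sequence would be the reverse complement of the residues printed on the Db lines. A minimal sketch with a hypothetical helper name:

```python
# Translation table for the four bases plus the ambiguity code n/N.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a nucleotide string."""
    return seq.translate(_COMPLEMENT)[::-1]

# Last alignment row of this result: Db positions 207 down to 193.
shown = "AAAACGGCCCCTTGA"
print(reverse_complement(shown))  # residues 193..207 as stored, if the hit is on the complement
```

Applying the function twice returns the original string, which is a convenient sanity check when comparing against the stored entry.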



RESULT 11 

US-09-880-107-481/C

; Sequence 481, Application US/09880107
; Patent No. US20020142981A1
; GENERAL INFORMATION:
; APPLICANT: Horne, Darci T.
; APPLICANT: Vockley, Joseph G.
; APPLICANT: Scherf, Uwe
; APPLICANT: Gene Logic, Inc.
; TITLE OF INVENTION: Gene Expression Profiles in Liver Cancer
; FILE REFERENCE: 44921-5028-WO
; CURRENT APPLICATION NUMBER: US/09/880,107
; CURRENT FILING DATE: 2001-06-14
; PRIOR APPLICATION NUMBER: US 60/211,379
; PRIOR FILING DATE: 2000-06-14
; PRIOR APPLICATION NUMBER: US 60/237,054
; PRIOR FILING DATE: 2000-10-02
; NUMBER OF SEQ ID NOS: 3950
; SOFTWARE: PatentIn Ver. 2.1
; SEQ ID NO 481
; LENGTH: 387
; TYPE: DNA
; ORGANISM: Homo sapiens
; FEATURE:
; OTHER INFORMATION: Genbank Accession No. US20020142981A1 AA234362
US-09-880-107-481



Query Match 19.2%; Score 195; DB 10; Length 387;

Best Local Similarity 100.0%; Pred. No. 1.1e-45;

Matches 195; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy 820 TTTCATGTTTTTAAGGTGTTTGTGGCCAGTCCTCACAAAACACAGCCTATTGTGGAGATC 879

       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

Db 387 TTTCATGTTTTTAAGGTGTTTGTGGCCAGTCCTCACAAAACACAGCCTATTGTGGAGATC 328

Qy 880 CTGTTAAAAAATCAGCCCAAACTCATTGAGTTTCTGAGCAGCTTCCAAAAAGAAAGGACG 939

       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

Db 327 CTGTTAAAAAATCAGCCCAAACTCATTGAGTTTCTGAGCAGCTTCCAAAAAGAAAGGACG 268

Qy 940 GATGATGAGCAGTTCGCTGACGAGAAGAACTACTTGATTAAACAGATCCGAGACTTGAAG 999

       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

Db 267 GATGATGAGCAGTTCGCTGACGAGAAGAACTACTTGATTAAACAGATCCGAGACTTGAAG 208

Qy 1000 AAAACGGCCCCTTGA 1014

        |||||||||||||||

Db  207 AAAACGGCCCCTTGA 193




RESULT 12 

US-10-257-826A-118 

; Sequence 118, Application US/10257826A
; Publication No. US20030181407A1
; GENERAL INFORMATION:
; APPLICANT: SA MAJESTE LA REINE DU CHEF DU CANADA
; APPLICANT: PALIN, Marie-France
; APPLICANT: POMAR, Candido
; APPLICANT: GARIEPY, Claude
; TITLE OF INVENTION: Steatosis-modulating factors and uses
; TITLE OF INVENTION: thereof
; FILE REFERENCE: 14654-2US
; CURRENT APPLICATION NUMBER: US/10/257,826A
; CURRENT FILING DATE: 2002-10-17
; PRIOR APPLICATION NUMBER: 60/197936
; PRIOR FILING DATE: 2000-04-17
; PRIOR APPLICATION NUMBER: PCT/CA01/00509
; PRIOR FILING DATE: 2001-04-12
; NUMBER OF SEQ ID NOS: 305
; SOFTWARE: FastSEQ for Windows Version 4.0
; SEQ ID NO 118
; LENGTH: 722
; TYPE: DNA
; ORGANISM: Artificial Sequence
; FEATURE:
; OTHER INFORMATION: Artificial sequence
; OTHER INFORMATION: Muscular steatosis
; OTHER INFORMATION: Porcine
; FEATURE:
; NAME/KEY: misc_feature
; LOCATION: (1)...(722)
; OTHER INFORMATION: n = A,T,C or G
US-10-257-826A-118 

Query Match 16.7%; Score 169.8; DB 13; Length 722; 

Best Local Similarity 60.1%; Pred. No. 3.1e-38; 

Matches 303; Conservative 0; Mismatches 196; Indels 5; Gaps 4; 
Qy 347 TGGAGTATATTAGTGCTCATCCTCATATCCTGTTTATGCTCCTCAAAGGATATGAAGCCC 4 06 

III III III I I I I Ml III I I 1 1 1 1 II II 

Db 8 TGGTGAATN CCTCTG C C C C CA CNGAATTTT TGGT CATGGTANTNGAAGGGGATNAATNTT 67 

Qy 4 07 CACAGATTGCCTTACGTTGTGGGATTATGCTGAGAGAATGTATTCGACATGAACCACTTG 466 



Db 




Qy 4 67 CCAAAATCATCCTCTTTTC - -TAATCAATTCAGAGATTTCTTTAAGTACGTGGAGTTGTC 524 

lllllllll! I I I II II I III 1 1 1 1 I II II I III 

Db 12 8 CCAAAATCATTTTGNGGGCCGAACACAGTTTATAGAGATCTTCACATATGTCTAAATGTN 187 

Qy 525 AACA-TTTGATATTGCTTCAGATGCCTTTGCTACTTTCAAGGATTTACTAACCAGACATA 583 

I II III I II III I II I II I I II I Mill I MINI 

Db 18 8 ANCATTTTNACATATCTTTACATNCNNTTNCCNCATTTTNNGNNTTACTTTCACGACATA 24 7 

Qy 584 AAGTGTTGGTAGCAGACTTCTTAGAACAAAATTACGACACTATTTTTGAAGACTATGAGA 643 

I II I II I I I! Mill MM II I MM II Mill 

Db 24 8 TATTGCTCACNGCGCAANTTTTGGAACANCATTATGATANATTTTTCAGTGAATATGATG 307 

Qy 644 AATTGCTTCAGTCTGAGAATTATGTTACTAAGAGACAGTCTTTAAAGCTGCTAGGGGAGC 703 

II llllll III I MINIM I I Mill II I Mill II II II I 

Db 3 08 AAGNGCTTCATTCTTAAAATTATGTGGCCACAAGACAATCACTGAAGCTTCTCGGNGAAC 3 67 

Qy 704 TGATCCTGGACCGTCACAACTTTGCC^ 763 

I I II I II I MM III Mill I II III llllll III 

Db 3 68 TACTACTANATAGACNCNACTTCNCCANTATGACCACATACCTCATTAAACCTGNGNACC 427 

Qy 764 TGAAACTCATGATGAACCTCCTTCGGGATAAAAGTCCCAA - CATCCAGTTTGAAGCCTTT 822 

I I I IMIMMIM I I II 1 1 i 1 1 1 1 II I MM Mill I Ml 

Db 428 T - CCATTAATGATGAACCTGCCTGCAGAGAAAAGTCGGAACCTTCCANTTTGAGGGCTTN 486 

Qy 823 CATGTTTTTAAGGTGTTTGTGGCC 84 6 

II MINIM I I III I 

Db 487 CACGTTTTTAANGGGGNTGTNNNC 510 



RESULT 13 

US-10-257-826A-119 

; Sequence 119, Application US/10257826A
; Publication No. US20030181407A1
; GENERAL INFORMATION:
; APPLICANT: SA MAJESTE LA REINE DU CHEF DU CANADA
; APPLICANT: PALIN, Marie-France
; APPLICANT: POMAR, Candido
; APPLICANT: GARIEPY, Claude
; TITLE OF INVENTION: Steatosis-modulating factors and uses
; TITLE OF INVENTION: thereof
; FILE REFERENCE: 14654-2US
; CURRENT APPLICATION NUMBER: US/10/257,826A
; CURRENT FILING DATE: 2002-10-17
; PRIOR APPLICATION NUMBER: 60/197936
; PRIOR FILING DATE: 2000-04-17
; PRIOR APPLICATION NUMBER: PCT/CA01/00509
; PRIOR FILING DATE: 2001-04-12
; NUMBER OF SEQ ID NOS: 305
; SOFTWARE: FastSEQ for Windows Version 4.0
; SEQ ID NO 119
; LENGTH: 700
; TYPE: DNA
; ORGANISM: Artificial Sequence
; FEATURE:
; OTHER INFORMATION: Artificial sequence
; OTHER INFORMATION: Muscular steatosis
; OTHER INFORMATION: Porcine
; FEATURE:
; NAME/KEY: misc_feature
; LOCATION: (1)...(700)
; OTHER INFORMATION: n = A,T,C or G
US-10-257-826A-119 

Query Match 16.4%; Score 166.6; DB 13; Length 700; 

Best Local Similarity 60.1%; Pred. No. 2.6e-37; 

Matches 304; Conservative 0; Mismatches 197; Indels 5; Gaps 4; 

Qy 34 5 TGTGGAGTATATTAGTGCTCATCCTCATATCCTGTTTATGCTCCTCAAAGGATATGAAGC 4 04 

I Ml Ml III I I I I II I III I I I I I I II II 

Db 6 TCTGGTGAATCCCTCTGCCCCCACNGAATTTTTGGTCATGGTANTNGAAGGGGATNAATN 65 

Qy 4 05 CC CACAGATTG CCTTACGTTGTGGGATTATG CTGAGAGAATGTATT CGACATGAAC CACT 4 64 

I I III I I III II II III I Ml 1 1 1 1 II III I III II 

Db 66 TTCCGAAATTTCGATTAATTGGGGNATNATGGTNAGANAATGCCTTNGACCTCCACCGCT 125 

Qy 4 65 TGCCAAAATCATCCTCTTTTC - - TAATCAATTCAGAGATTTCTTTAAGTACGTGGAGTTG 522 

IMIIIIIIIII I I I II II I III I I I I I II II I II 

Db 12 6 TGCCAAAATCATTTTGNGGGCCGAACACAGTTTATAGAGA 185 

Qy 523 TCAACA - TTTGATATTGCTTCAGATGCCTTTGCTACTTTCAAGGATTTACTAACCAGACA 581 

I I II III I II 1 1 1 I II I II I I II I Mill I MM 

Db 18 6 TNANCATTTTNACATATCTTTACATNCNNTTNCCNCATTTTNNGNNTTACTTTCACGACA 24 5 

Qy 582 TAAAGTGTTGGTAGC^GACTTCTTAGAACAAAATTACGACACTATTTTTGAAGACTATGA 641 

llllll Ml I II Mill Ml! II I MM II Mill 

Db 246 TATATTG CTCA CNG CG CAANTTTTGGAACANCATTATGATANATTTTT CAGTGAATATGA 3 05 

Qy 642 GAAATTGCTTCAGTCTGAGAATTATGTTACTAAGAGACAGTCTTTAAAGCTGCTAGGGGA 701 

II MUM III I MMMII I I Mill II I Mill II II M 

Db 3 06 TGAAGNG CTT CATT CTTAAAATTATG TGG C CA CAAGA CAATCACTGAAGCTTCT CGGNGA 365 

Qy 7 02 GCTGATCCTGGACCGTCA(^CTTTGCCATCATGACAAAGTATAT(^GCAAGCCGGAG^ 761 

M I II I I I I MM III Mill I II III II II I I I 

Db 3 66 ACTACTACTANATAGACNCNACTTCNCCANTATGACCACATACCTCATTAAACCTGNGNA 425 

Qy 7 62 CCTGAAACTCATGATGAACCTCCTTCGGGATAAAAGTCCCAA - CATCCAGTTTGAAGCCT 820 

Ml I I .III I I II 1 1 ! 1 1 1 i II I MM Mill I II 

Db 42 6 CCT-CCATTAATGATGAACCTGCCTGCAGACAAAAGTCGGAACCTTCCANTTTGAGGGCT 484 

Qy 821 TTCATGTTTTTAAGGTGTTTGTGGCC 84 6 

I II llllllll I I III I 

Db 4 85 TNCACGTTTTTAANGGGGNTGTNNNC 510 



RESULT 14 
US-09-770-445-592 

; Sequence 592, Application US/09770445
; Patent No. US20020023281A1
; GENERAL INFORMATION:
; APPLICANT: Gorlach, Jorn
; APPLICANT: An, Yong-Qiang
; APPLICANT: Hamilton, Carol M.
; APPLICANT: Price, Jennifer L.
; APPLICANT: Raines, Tracy M.
; APPLICANT: Yu, Yang
; APPLICANT: Rameaka, Joshua G.
; APPLICANT: Page, Amy
; APPLICANT: Matthew, Abraham V.
; APPLICANT: Ledford, Brooke L.
; APPLICANT: Woessner, Jeffrey P.
; APPLICANT: Haas, William David
; APPLICANT: Garcia, Carlos A.
; APPLICANT: Kricker, Maja
; APPLICANT: Slader, Ted
; APPLICANT: Davis, Keith R.
; APPLICANT: Allen, Keith
; APPLICANT: Hoffman, Neil
; APPLICANT: Hurban, Patrick
; TITLE OF INVENTION: Expressed Sequences of Arabidopsis
; TITLE OF INVENTION: thaliana
; FILE REFERENCE: 2023US (PARA-012PRV)
; CURRENT APPLICATION NUMBER: US/09/770,445
; CURRENT FILING DATE: 2001-01-26
; PRIOR APPLICATION NUMBER: US 60/178,472
; PRIOR FILING DATE: 2000-01-27
; NUMBER OF SEQ ID NOS: 999
; SOFTWARE: FastSEQ for Windows Version 4.0
; SEQ ID NO 592
; LENGTH: 861
; TYPE: DNA
; ORGANISM: Arabidopsis thaliana
US-09-770-445-592



Query Match 15.4%; Score 156; DB 9; Length 861;

Best Local Similarity 55.8%; Pred. No. 3.4e-34;

Matches 319; Conservative 0; Mismatches 250; Indels 3; Gaps

Qy 394 GGATATGAAGCCCCACAGATTGCCTTACGTTGTGGGATTATGCTGAGAGAATGTATTCGA 453

M I I I I I M I II II I I I I I III I I I II I Mill II II II

Db 12 GGGTTTGAAAACACCGATATGGCGTTACACTATGGTACTATGTTTAGAGAGTGCATCCGT 71

Qy 454 CATGAACCACTTGCCAAAATCATCCTCTTTTCTAATCAATTCAGAGATTTCTTTAAGTAC 513

Db 72 cItCAGATTGTTGoJlIaTATGTTTTGGACTC^ 131

Qy 514 GTGGAGTTGTCAACATTTGATATTGCTTCAGATGCCTTTGCTACTTTCAAGGATTTACTA 573

I II I I I II II MINI I Mill III Mill Mill I II

Db 132 ATACAGCTTCCCAATTTCGACATTGCTGCTGATGCTGCTGCAACTTTTAAGGAACTTCTG 191

Qy 574 ACCAGACATAAAGTGTTGGTAGCAGACTTCTTAGAACAAAATTACGACACTATTTTTGAA 633

II II II II II II II II I I III I III llllll I

Db 192 ACAAGGCACAAGTCTACAGTTGCTGAGTTTCTCATTAAGAATGAAGACTGGTTTTTTGCA 251

Qy 634 GACTATGA---GAAATTGCTTCAGTCTGAGAATTATGTTACTAAGAGACAGTCTTTAAAG 690

Mill I II I III I II llllll MM I II III III

Db 252 GACTACAACTCAAAGCTTCTTGAATCAACTAATTATATTACCCGACGGCAAGCTATTAAG 311

Qy 691 CTGCTAGGGGAGCTGATCCTGGACCGTCACAACTTTGCCATCATGACAAAGTATATCAGC 750

IN M II I I MM I llllll Mill MUM I Ml

Db 312 TTGTTGGGCGATATATTATTGGATAGGTCAAATTCGGCTGTGATGACGAAGTATGTGAGC 371

Qy 751 AAGCCGGAGAACCTGAAACTCATGATGAACCTCCTTCGGGATAAAAGTCCCAACATCCAG 810

Db 372

Qy 811 TTTGAAGCCTTTCATGTTTTTAAGGTGTTTGTGGCCAGTCCTCACAAAACACAGCCTATT 870

I Mill II 1 1 1 1 E 1 1 1 III lllllll II I I MM I II

Db 432 ATAGAAGCTTTCC^VTGTTTTCAAGCTGTTTGTAGCGAACCAAAACAAGCCTTCAGACATC 491

Qy 871 GTGGAGATCCTGTTAAAAAATCAGCCCAAACTCATTGAGTTTCTGAGCAGCTTCCAAAAA 930

I I M III I III III II I II II || I

Db 492 GCCAACATTCTGGTGGC^^CAGAAAO^GCTTCTGAGATTGTTGGCTGATATCAAGCCG 551

Qy 931 GAAAGGACGGATGATGAGCAGTTCGCTGACGA 962

Db 552 GACAAAGAGGACGAGAGGTTTGACGCAGACAA 583




RESULT 15 

US-09-923-876-1251 

; Sequence 1251, Application US/09923876
; Patent No. US20020013958A1
; GENERAL INFORMATION:
; APPLICANT: Lalgudi, Raghunath V.
; APPLICANT: Kamigaki, Laura Y. (Ito)
; APPLICANT: Sherman, Bradley K.
; TITLE OF INVENTION: POLYNUCLEOTIDES AND POLYPEPTIDES DERIVED FROM CORN SEEDLING
; FILE REFERENCE: PL-0012-1 CON
; CURRENT APPLICATION NUMBER: US/09/923,876
; CURRENT FILING DATE: 2001-08-06
; PRIOR APPLICATION NUMBER: 09/298,329
; PRIOR FILING DATE: 1999-04-21
; PRIOR APPLICATION NUMBER: 60/085,331
; PRIOR FILING DATE: 1998-05-05
; NUMBER OF SEQ ID NOS: 6332
; SOFTWARE: PERL Program
; SEQ ID NO 1251
; LENGTH: 262
; TYPE: DNA
; ORGANISM: Zea mays
; FEATURE:
; NAME/KEY: misc_feature
; OTHER INFORMATION: Incyte ID No. US20020013958A1 700158378H1
; NAME/KEY: unsure
; LOCATION: 148
; OTHER INFORMATION: a, t, c, g, or other
US-09-923-876-1251 

Query Match 7.3%; Score 74.2; DB 9; Length 262; 

Best Local Similarity 55.5%; Pred. No. 6.4e-ll; 

Matches 142; Conservative 0; Mismatches 114; Indels 0; Gaps 0; 
Qy 311 TCTTGAGAAGACAGATAGGCACTCGGAGTCCTACTGTGGAGTATATTAGTGCTCATCCTC 370 



Db 




Qy 371 ATATCCTGTTTATGCTCCTCAAAGGATATGAAGCCCCACAGATTGCCTTACGTTGTGGGA 430 

II I II I I II I I III I I I I II II 1 1 1 1 1 1 I 

Db 67 ATCTTTTGGATTTCCTTGTTGTTTGCTATAAGAACTTGGAAGTCGCGTTGAATTGTGGAA 126 

Qy 431 TTATGCTGAGAGAATGTATTCGACATGAACCACTTGCCAAAATCATCCTCTTTTCTAATC 4 90 

Ml II lllllll II I II Mill!! Ml III II I 

Db 127 ACATGTTG CGAGAATG CATAANATAT CCTACACTTGCAAAATATATATTGGAGT CAAGCA 186 

Qy 4 91 AATTCAGAGATTTCTTTAAGTACGTGGAGTTGTCAACATTTGATATTGCTTCAGATGCCT 550 

Ml II I! III! II 1 1 lllllll II MINI II Mill 

Db 187 G CTT CGAGTTGTTTTT C CAGTATGTTGAATTGT CAAACTT CGATATTG CATCTGATG CTC 24 6 

Qy 551 TTGCTACTTTCAAGGA 566 

I I I I I I M I I I I 

Db 247 TGAACACTTT CAAGGA 262 



Search completed: January 6, 2004, 05:04:45 
Job time : 1400 secs



GenCore version 5.1.6 
Copyright (c) 1993 - 2004 Compugen Ltd. 



OM nucleic - nucleic search, using sw model 



Run on: January 6, 2004, 01:15:17 ; Search time 2583 Seconds
(without alignments)
9541.130 Million cell updates/sec

Title: US-10-088-872-1
Perfect score: 1014
Sequence: 1 atgaaaaaaatgcctttgtt..........tgaagaaaacggccccttga 1014

Scoring table: IDENTITY_NUC
Gapop 10.0 , Gapext 1.0

Searched: 22781392 seqs, 12152238056 residues

Total number of hits satisfying chosen parameters: 45562784

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
Maximum Match 100%
Listing first 45 summaries
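
The throughput figure in the header can be cross-checked against the other header values. The sketch below assumes that one "cell update" is one query-residue by database-residue comparison and that both strands are searched. The factor of two is an inference (the hit total is exactly twice the number of sequences searched), not something the report states, and the function name is hypothetical.

```python
def million_cell_updates_per_sec(query_len: int, db_residues: int,
                                 seconds: float, strands: int = 2) -> float:
    """Dynamic-programming cells filled per second, in millions (assumed definition)."""
    cells = strands * query_len * db_residues  # one cell per query/database residue pair
    return cells / seconds / 1e6

# Values from this header: 1014-residue query, 12152238056 residues, 2583 seconds.
rate = million_cell_updates_per_sec(1014, 12_152_238_056, 2583)
print(f"{rate:.3f} Million cell updates/sec")

# The hit total likewise equals two entries (one per strand) for each sequence searched.
print(2 * 22_781_392)
```

Both computed values agree with the figures printed in the header, so the digits in this block appear to have survived OCR intact.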



Database : EST:*
1: em_estba:*
2: em_esthum:*
3: em_estin:*
4: em_estmu:*
5: em_estov:*
6: em_estpl:*
7: em_estro:*
8: em_htc:*
9: gb_est1:*
10: gb_est2:*
11: gb_htc:*
12: gb_est3:*
13: gb_est4:*
14: gb_est5:*
15: em_estfun:*
16: em_estom:*
17: em_gss_hum:*
18: em_gss_inv:*
19: em_gss_pln:*
20: em_gss_vrt:*
21: em_gss_fun:*
22: em_gss_mam:*
23: em_gss_mus:*
24: em_gss_pro:*
25: em_gss_rod:*
26: em_gss_phg:*
27: em_gss_vrl:*
28: gb_gss1:*
29: gb_gss2:*


Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 



SUMMARIES

                  %
Result          Query
   No.   Score  Match  Length  DB            ID            Description
--------------------------------------------------------------------------------
    1    860.4   84.9    1552  11            AK076867      AK076867 Mus muscu
    2    860.4   84.9    2245  11            AK030474      AK030474 Mus muscu
    3    860.4   84.9    3039  11            AK0S1642      [illegible] Mus muscu
    4    858.8   84.7    1377  11            AK0767 c ift  [illegible] Mus muscu
    5    844.8   83.3    1449  11            AKO 1 19 0 R  [illegible] Mus muscu
    6    770.6   76.0     822  9             ATI1 9 R 1 07 [illegible]
    7    750.8   74.0    1201  13            [illegible]   [illegible]
    8    709.4   70.0    1379  11            AKO0S32 3     [illegible]
    9    671.8   66.3     784  10            BG218735      [illegible]
   10    622.8   61.4    1281  11            AK013161      [illegible]
   11    614     60.6     951  13            BU116522      [illegible]
   12    594.2   58.6     982  13            [illegible]   [illegible]
   13    585.2   57.7     934  13            BUS 188 07    [illegible]
   14    579.2   57.1     713  14            [illegible]   [illegible]
   15    578     57.0     742  2             HSM073180     [illegible]
   16    565     55.7     946  14            CA973078      [illegible]
   17    553.4   54.6     958  [illegible]   RTTR 1 4 Q9 0 [illegible]
   18    550.2   54.3     563  9             AA978473      [illegible]
   19    542.6   53.5     930  14            PA 9 ft 9 ft 9 [illegible]
   20    536.2   52.9     732  14            rm 03ftoi     [illegible]
   21    533.2   52.6    1186  10            BF159587      [illegible]
   22    521.4   51.4     721  9             AW2 4 2 ft 3 9 [illegible]
   23    517.8   51.1     985  13            [illegible]   [illegible]
   24    500.4   49.3    1060  13            ROft 9 9M 7   [illegible]
   25    499.6   49.3     817  14            PRR S9060     [illegible]
   26    491.8   48.5     893  [illegible]   RTT1fi7^71    [illegible]
   27    486.6   48.0     842  [illegible]   RPQ4 0£4 7    [illegible]
   28    473.6   46.7  [illegible]  14       pr 9 0 0 4 £ ^ [illegible]
   29    461     45.5     634  12            RP09 114ft    [illegible]
   30    460.8   45.4     933  14            CA983082      CA983082 AGENCOURT
   31    457.6   45.1     575  12            BI499153      BI499153 ie27c05.y
   32    452     44.6     863  13            BU905264      BU905264 AGENCOURT
   33    451.4   44.5     927  12            BI655225      BI655225 603284039
   34    451.2   44.5     876  13            BQ880200      BQ880200 AGENCOURT
   35    445.2   43.9     955  13            BU152376      BU152376 AGENCOURT
   36    445     43.9     534  9             AI645170      AI645170 mt82a02.y
   37    443.8   43.8     833  9             AL961440      AL961440 AL961440
   38    442.8   43.7     853  14            CB564338      CB564338 AGENCOURT
   39    441.6   43.6     697  12            BI828691      BI828691 603074707
   40    435.8   43.0    1046  14            CB235478      CB235478 AGENCOURT
   41    435.2   42.9     440  9             AA669484      AA669484 af74g07.r
   42    434.2   42.8     662  13            BU631151      BU631151 UI-H-FE1-
   43    433     42.7     961  13            BU129770      BU129770 603117261
   44    430     42.4     752  14            CB524885      CB524885 UI-M-FY0-
   45    428     42.2     985  14            CA970822      CA970822 AGENCOURT


ALIGNMENTS


ALIGNMENTS 



RESULT 1 
AK076867
LOCUS       AK076867 1552 bp mRNA linear HTC 07-DEC-2002
DEFINITION  Mus musculus adult male testis cDNA, RIKEN full-length enriched
            library, clone:4930520C08 product:M025-LIKE PROTEIN homolog [Homo
            sapiens], full insert sequence.
ACCESSION   AK076867
VERSION     AK076867.1 GI:26345723
KEYWORDS    HTC; CAP trapper.
SOURCE      Mus musculus (house mouse)
  ORGANISM  Mus musculus
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
REFERENCE   1
  AUTHORS   Carninci,P. and Hayashizaki,Y.
  TITLE     High-efficiency full-length cDNA cloning
  JOURNAL   Meth. Enzymol. 303, 19-44 (1999)
  MEDLINE   99279253
  PUBMED    10349636
REFERENCE   2
  AUTHORS   Carninci,P., Shibata,Y., Hayatsu,N., Sugahara,Y., Shibata,K.,
            Itoh,M., Konno,H., Okazaki,Y., Muramatsu,M. and Hayashizaki,Y.
  TITLE     Normalization and subtraction of cap-trapper-selected cDNAs to
            prepare full-length cDNA libraries for rapid discovery of new genes
  JOURNAL   Genome Res. 10 (10), 1617-1630 (2000)
  MEDLINE   20499374
  PUBMED    11042159
REFERENCE   3
  AUTHORS   Shibata,K., Itoh,M., Aizawa,K., Nagaoka,S., Sasaki,N., Carninci,P.,
            Konno,H., Akiyama,J., Nishi,K., Kitsunai,T., Tashiro,H., Itoh,M.,
            Sumi,N., Ishii,Y., Nakamura,S., Hazama,M., Nishine,T., Harada,A.,
            Yamamoto,R., Matsumoto,H., Sakaguchi,S., Ikegami,T., Kashiwagi,K.,
            Fujiwake,S., Inoue,K., Togawa,Y., Izawa,M., Ohara,E., Watahiki,M.,
            Yoneda,Y., Ishikawa,T., Ozawa,K., Tanaka,T., Matsuura,S., Kawai,J.,
            Okazaki,Y., Muramatsu,M., Inoue,Y., Kira,A. and Hayashizaki,Y.
  TITLE     RIKEN integrated sequence analysis (RISA) system--384-format
            sequencing pipeline with 384 multicapillary sequencer
  JOURNAL   Genome Res. 10 (11), 1757-1771 (2000)
  MEDLINE   20530913
  PUBMED    11076861
REFERENCE   4
  AUTHORS   Kawai,J., Shinagawa,A., Shibata,K., Yoshino,M., Itoh,M., Ishii,Y.,
            Arakawa,T., Hara,A., Fukunishi,Y., Konno,H., Adachi,J., Fukuda,S.,
            Aizawa,K., Izawa,M., Nishi,K., Kiyosawa,H., Kondo,S., Yamanaka,I.,
            Saito,T., Okazaki,Y., Gojobori,T., Bono,H., Kasukawa,T., Saito,R.,
            Kadota,K., Matsuda,H., Ashburner,M., Batalov,S., Casavant,T.,
            Fleischmann,W., Gaasterland,T., Gissi,C., King,B., Kochiwa,H.,
            Kuehl,P., Lewis,S., Matsuo,Y., Nikaido,I., Pesole,G.,
            Quackenbush,J., Schriml,L.M., Staubli,F., Suzuki,R., Tomita,M.,
            Wagner,L., Washio,T., Sakai,K., Okido,T., Furuno,M., Aono,H.,
            Baldarelli,R., Barsh,G., Blake,J., Boffelli,D., Bojunga,N.,
            Carninci,P., de Bonaldo,M.F., Brownstein,M.J., Bult,C.,
            Fletcher,C., Fujita,M., Gariboldi,M., Gustincich,S., Hill,D.,
            Hofmann,M., Hume,D.A., Kamiya,M., Lee,N.H., Lyons,P.,
            Marchionni,L., Mashima,J., Mazzarelli,J., Mombaerts,P., Nordone,P.,
            Ring,B., Ringwald,M., Rodriguez,I., Sakamoto,N., Sasaki,H.,
            Sato,K., Schonbach,C., Seya,T., Shibata,Y., Storch,K.F., Suzuki,H.,
            Toyo-oka,K., Wang,K.H., Weitz,C., Whittaker,C., Wilming,L.,
            Wynshaw-Boris,A., Yoshida,K., Hasegawa,Y., Kawaji,H., Kohtsuki,S.
            and Hayashizaki,Y.
  TITLE     Functional annotation of a full-length mouse cDNA collection
  JOURNAL   Nature 409 (6821), 685-690 (2001)
  MEDLINE   21085660
  PUBMED    11217851
REFERENCE   5
  AUTHORS   The FANTOM Consortium and the RIKEN Genome Exploration Research
            Group Phase I & II Team.
  TITLE     Analysis of the mouse transcriptome based on functional annotation
            of 60,770 full-length cDNAs
  JOURNAL   Nature 420, 563-573 (2002)
REFERENCE   6 (bases 1 to 1552)
  AUTHORS   Adachi,J., Aizawa,K., Akahira,S., Akimura,T., Aono,H., Arai,A.,
            Arakawa,T., Bono,H., Carninci,P., Fukuda,S., Fukunishi,Y.,
            Furuno,M., Hanagaki,T., Hara,A., Hayatsu,N., Hiramoto,K.,
            Hiraoka,T., Hori,F., Imotani,K., Ishii,Y., Itoh,M., Izawa,M.,
            Kasukawa,T., Kato,H., Kawai,J., Kojima,Y., Konno,H., Kouda,M.,
            Koya,S., Kurihara,C., Matsuyama,T., Miyazaki,A., Nishi,K.,
            Nomura,K., Numazaki,R., Ohno,M., Okazaki,Y., Okido,T., Owa,C.,
            Saito,H., Saito,R., Sakai,C., Sakai,K., Sano,H., Sasaki,D.,
            Shibata,K., Shibata,Y., Shinagawa,A., Shiraki,T., Sogabe,Y.,
            Suzuki,H., Tagami,M., Tagawa,A., Takahashi,F., Tanaka,T.,
            Tejima,Y., Toya,T., Yamamura,T., Yamanaka,I., Yasunishi,A.,
            Yoshida,K., Yoshino,M., Muramatsu,M. and Hayashizaki,Y.
  TITLE     Direct Submission
  JOURNAL   Submitted (16-APR-2002) Yoshihide Hayashizaki, The Institute of
            Physical and Chemical Research (RIKEN), Laboratory for Genome
            Exploration Research Group, RIKEN Genomic Sciences Center (GSC),
            RIKEN Yokohama Institute; 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama,
            Kanagawa 230-0045, Japan (E-mail:genome-res@gsc.riken.go.jp,
            URL:http://genome.gsc.riken.go.jp/, Tel:81-45-503-9222,
            Fax:81-45-503-9216)
COMMENT     cDNA library was prepared and sequenced in Mouse Genome
            Encyclopedia Project of Genome Exploration Research Group in Riken
            Genomic Sciences Center and Genome Science Laboratory in RIKEN.
            Division of Experimental Animal Research in Riken contributed to
            prepare mouse tissues.
            Please visit our web site for further details.
            URL:http://genome.gsc.riken.go.jp/
            URL:http://fantom.gsc.riken.go.jp/.
FEATURES             Location/Qualifiers
     source          1..1552
                     /organism="Mus musculus"
                     /mol_type="mRNA"
                     /strain="C57BL/6J"
                     /db_xref="FANTOM_DB:4930520C08"
                     /db_xref="MGI:1894876"
                     /db_xref="taxon:10090"
                     /clone="4930520C08"
                     /sex="male"
                     /tissue_type="testis"
                     /clone_lib="RIKEN full-length enriched mouse cDNA library"
                     /dev_stage="adult"
     CDS             316..1320
                     /note="unnamed protein product; M025-LIKE PROTEIN homolog
                     [Homo sapiens] (SWISSPROT|Q9H9S4, evidence: FASTY,
                     98.2%ID, 100%length, match=1002)
                     putative"
                     /codon_start=1
                     /protein_id="BAC36513.1"
                     /db_xref="GI:26345724"
                     /db_xref="MGI:1914081"
                     /translation="MPLFSKSHKNPAEIVKILKDNLAILEKQDKKTDKASEEVSKSLQ
                     AMKEILCGTNDKEPPTEAVAQLAQELYSSGLLVTLIADLQLIDFEGKKDVTQIFNNIL
                     RRQIGTRCPTVEYISSHPHILFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFSN
                     QFRDFFKYVELSTFDIASDAFATFKDLLTRHKVLVADFLEQNYDTIFEDYEKLLQSEN
                     YVTKRQSLKLLGELILDRHNFTIMTKYISKPENLKLMMNLLRDKSPNIQFEAFHVFKV
                     FVASPHKTQPIVEILLKNQPKLIEFLSSFQKERTDDEQFADEKNYLIKQIRDLKKAAP
                     "
     polyA_signal    1539..1544
                     /note="putative"
     polyA_site      1552
                     /note="putative"
BASE COUNT  490 a 320 c 341 g 401 t
ORIGIN



Query Match 84.9%; Score 860.4; DB 11; Length 1552;

Best Local Similarity 90.5%; Pred. No. 5.8e-176;

Matches 918; Conservative 0; Mismatches 96; Indels 0; Gaps 0;

Qy 1 ATGAAAAAAATGCCTTTGTTTAGTAAATCACACAAAAATCCAGC^ 60

1 1 II II 1 ! 1 1 Ml 1 III 1 1 M 1 1 1 1 Ml 1 1 1 1 1 1 1 1 1 II 1 1 ! 1 1 1 II 1 1! M Mill

Db 307 ATGAAAAAAATGCCTTTGTTTAGTAAATCACACAAAAATCC 366

Qy 61 CTGAAAGACAATTTGGCCATTTTGGAAAAGCAAGACAAAAAGACAGACAAGGCTTCAGAA 120

MINIUM M II 1 M M 1 1 1 1 Ml 1 1 1 1 M 1 II 1 1 M 1 1 1 M 1 1 1 1 M 1 1 M M

Db 367 CTGAAAGACAACCTGGCCATTTTGGAAAAGCAAGACAAAAAGA(^ 426

Qy 121 GAAGTGTCTAAATCACTGCAAGCAATGAAAGAAATTCTGTGTGGTACAAACGAGAAAGAA 180

M MIM INN MMMMMMM 1 1 ! 1 1 1 ! 1 1 1 1 1 1 1 1 1 MMI 1 1 II

Db 427 GAGGTGTOUUUyrCTCTGCjAAGCAATGAAGGAAA 486

Qy 181 CCCCCAACAGAAGCAGTGGCTCAGCTAGCACAAGAACTCTACAGCAGTGGCCTGCTAGTG 240

MIM II III MM III MINIM II II II II II III II II II MM III

Db 487 CCCCCTACAGAAGCAGTGGCTCAGCTGGCGCAGGAGCTCTACAGCA 546

Qy 241 ACACTGATAGCTGACCTGCAGCTGATAGACTTTGAGGGAAAAAAAGATGTGACCCAGATA 300

Mill i II MMI . 1 1 1 ! 1 1 1 1 1 N 1 ! 1 1 1 II 1 1 1 II 1 1 1 1 1 1 II 1 1 1

Db 547 ACACTCATAGCTGACCTGCAGCTCATAGACTTTGAGGGAAAAAAAGATGTGACCCAGATA 606

Qy 301 TTTAACAACATCTTGAGAAGACAGATAGGCACTCGGAGTCCTACTGTGGAGTATATTAGT 360

II Ml II 1 1 II 1 1 II II II 1 II II III MIMMIII NIM II III

Db 607 TTCAACAACATCCTGAGAAGACAGATTGGTACACGGTGTCCTACTGTCGAGTACATCAGT 666

Qy 361 GCTCATCCTCATATCCTGTTTATGCTCCTCAAAGGATATGAAGCCCCACAGATTGCCTTA 420

1 II 1 1 M 1 M MMMMMMM MINI II 1 II II 1 1 1 II II 1 II 1 1 II 1 1 1

Db 667 TCTCATCCTCACATCCTGTTTATGCTTCTCAAAGGCTATGAAGCCCCACAGATTGCCTTA 726



Qy 


421 


Db 


727 


Qy 


481 


Db 


787 


Qy 


541 


Db 


847 


Qy 


601 


Db 


907 


Qy 


661 


Db 


967 


Qy 


721 


Db 


1027 


Qy 


781 


Db 


1087 


Qy 


841 


Db 


1147 


Qy 


901 


Db 


1207 


Qy 


961 


Db 


1267 



CGTTGTGGGATTATGCTGAGAGAATGTATTCGACATGAACCACTTGCCAAAATCATCCTC 4 80 

M 1 1 1 1 1 1 1 1 1 1 1 1 1 Mill I I I I I I I I i I I I I I I I I ! II I I I I I I II I I I I I 

CGCTGTGGGATTATGCTAAGAGAGTGTATTCGACATGAGCCACTTGCCAAAATCATCCTA 786 
TTTTCTAATCAATTOVGAGATTTCTTTAAGTACGTGGAGTTGTCAACATTTGATATTGCT 54 0 

MIMIIIIII llllllllllllll Mill II III 1 1 1 1 II MINIM Ml 

TTTTCTAATCAGTTCAGAGATTTCTTCAAGTATGTTGAGCTGTCCACCTTTGATATCGCT 846 
TCAGATGCCTTTGCTACTTTCAAGGATTTACTAACCAGACATAAAGTGTTGGTAGCAGAC 600 

MIMIIIIII MINIM 1 1 1 1 1 1 1 1 M 1 1 1 1 1 1 1 1 1 1 1 1 1 1 IMMMIMM 

TCAGATGCCTTCGCTACTTTTAAGGATTTGTTAACCAGACATAAAGTATTGGTAGCAGAC 906 
TTCTTAGAA(^UU^TTACGACACTATTTTTGAAGACTATGAGAAATTGCTTCAGTCTGAG 660 

I i 1 1 1 1 1 M 1 1 1 M 1 1 1 1 1 1 M 1 1 1 M 1 1 1 1 1 1 1 1 1 1! 1 1 1 MM II MUM 

TTCTTAGAACAAAATTATGACACTATTTTTGAAGACTATGAGAAACTGCTGCAATCTGAG 966 



M Mill II MINIM MINIMI 1 1 1 1 1 1 1 1 1 1 1 J 1 1 1 1 1 1 E I f 1 1 1 III 



N N INI INN NNNNNNINN llllllllllllll I IT! 



CTCCTTCGGGATAAAAGTCCCAACATCCAGTTTGAAGCCTTTCATGTTTTTAAGGTGTTT 84 0 

M Mill M 1 1 h 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 Mill MIMIIIIII 

CTGCTTCGAGACAAAAGTCCCAACATCCAATTCGAAGCCTTCCATGTCTTTAAGGTGTTT 114 6 

GTGGCCAGTCCTCACAAAACAC^GCCTATTGTGGAGATCCTGTTAAAAAATCAGCCCAAA 900 

I I I I I I I I I f I I I I I I I I II II II II II I II II I II II II II I I II M I II I II I 

GTGGCCAGCCCCCACAAAACGCAGCCTATCGTGGAGATTCT 12 06 

CTCATTGAGTTTCTGAG CAG CTTCCAAAAAGAAAGGACGGATGATGAG CAGTTCG CTGAC 960 

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 II II MM I II I II 

CTCATTGAGTTTCTGAGCAGCTTTCAGAAAGAAAGGACAGACGACGAGCAGTTTGCTGAC 1266 
GAGAAGAACTACTTGATTAAACAGATCCGAGACTTGAAGAAAACGGCCCCTTGA 1014 

f 1 1 1 1 1 1 1 1 1 i I .MM i I NIIIIIIIIIMII I INN IN 



RESULT 2 
AK030474 
LOCUS       AK030474 2245 bp mRNA linear HTC 05-DEC-2002
DEFINITION  Mus musculus adult male pituitary gland cDNA, RIKEN full-length
            enriched library, clone:5330416K15 product:MO25-LIKE PROTEIN
            homolog [Homo sapiens], full insert sequence.
ACCESSION   AK030474
VERSION     AK030474.1 GI:26326468
KEYWORDS    HTC; CAP trapper.
SOURCE      Mus musculus (house mouse)
  ORGANISM  Mus musculus
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
REFERENCE   1
  AUTHORS   Carninci,P. and Hayashizaki,Y.
  TITLE     High-efficiency full-length cDNA cloning
  JOURNAL   Meth. Enzymol. 303, 19-44 (1999)
  MEDLINE   99279253
   PUBMED   10349636
REFERENCE   2
  AUTHORS   Carninci,P., Shibata,Y., Hayatsu,N., Sugahara,Y., Shibata,K.,
            Itoh,M., Konno,H., Okazaki,Y., Muramatsu,M. and Hayashizaki,Y.
  TITLE     Normalization and subtraction of cap-trapper-selected cDNAs to
            prepare full-length cDNA libraries for rapid discovery of new genes
  JOURNAL   Genome Res. 10 (10), 1617-1630 (2000)
  MEDLINE   20499374
   PUBMED   11042159
REFERENCE   3
  AUTHORS   Shibata,K., Itoh,M., Aizawa,K., Nagaoka,S., Sasaki,N., Carninci,P.,
            Konno,H., Akiyama,J., Nishi,K., Kitsunai,T., Tashiro,H., Itoh,M.,
            Sumi,N., Ishii,Y., Nakamura,S., Hazama,M., Nishine,T., Harada,A.,
            Yamamoto,R., Matsumoto,H., Sakaguchi,S., Ikegami,T., Kashiwagi,K.,
            Fujiwake,S., Inoue,K., Togawa,Y., Izawa,M., Ohara,E., Watahiki,M.,
            Yoneda,Y., Ishikawa,T., Ozawa,K., Tanaka,T., Matsuura,S., Kawai,J.,
            Okazaki,Y., Muramatsu,M., Inoue,Y., Kira,A. and Hayashizaki,Y.
  TITLE     RIKEN integrated sequence analysis (RISA) system--384-format
            sequencing pipeline with 384 multicapillary sequencer
  JOURNAL   Genome Res. 10 (11), 1757-1771 (2000)
  MEDLINE   20530913
   PUBMED   11076861
REFERENCE   4
  AUTHORS   Kawai,J., Shinagawa,A., Shibata,K., Yoshino,M., Itoh,M., Ishii,Y.,
            Arakawa,T., Hara,A., Fukunishi,Y., Konno,H., Adachi,J., Fukuda,S.,
            Aizawa,K., Izawa,M., Nishi,K., Kiyosawa,H., Kondo,S., Yamanaka,I.,
            Saito,T., Okazaki,Y., Gojobori,T., Bono,H., Kasukawa,T., Saito,R.,
            Kadota,K., Matsuda,H., Ashburner,M., Batalov,S., Casavant,T.,
            Fleischmann,W., Gaasterland,T., Gissi,C., King,B., Kochiwa,H.,
            Kuehl,P., Lewis,S., Matsuo,Y., Nikaido,I., Pesole,G.,
            Quackenbush,J., Schriml,L.M., Staubli,F., Suzuki,R., Tomita,M.,
            Wagner,L., Washio,T., Sakai,K., Okido,T., Furuno,M., Aono,H.,
            Baldarelli,R., Barsh,G., Blake,J., Boffelli,D., Bojunga,N.,
            Carninci,P., de Bonaldo,M.F., Brownstein,M.J., Bult,C.,
            Fletcher,C., Fujita,M., Gariboldi,M., Gustincich,S., Hill,D.,
            Hofmann,M., Hume,D.A., Kamiya,M., Lee,N.H., Lyons,P.,
            Marchionni,L., Mashima,J., Mazzarelli,J., Mombaerts,P., Nordone,P.,
            Ring,B., Ringwald,M., Rodriguez,I., Sakamoto,N., Sasaki,H.,
            Sato,K., Schonbach,C., Seya,T., Shibata,Y., Storch,K.F., Suzuki,H.,
            Toyo-oka,K., Wang,K.H., Weitz,C., Whittaker,C., Wilming,L.,
            Wynshaw-Boris,A., Yoshida,K., Hasegawa,Y., Kawaji,H., Kohtsuki,S.
            and Hayashizaki,Y.
  TITLE     Functional annotation of a full-length mouse cDNA collection
  JOURNAL   Nature 409 (6821), 685-690 (2001)
  MEDLINE   21085660
   PUBMED   11217851
REFERENCE   5
  AUTHORS   The FANTOM Consortium and the RIKEN Genome Exploration Research
            Group Phase I & II Team.
  TITLE     Analysis of the mouse transcriptome based on functional annotation
            of 60,770 full-length cDNAs
  JOURNAL   Nature 420, 563-573 (2002)
REFERENCE   6 (bases 1 to 2245)
  AUTHORS   Adachi,J., Aizawa,K., Akimura,T., Arakawa,T., Bono,H., Carninci,P.,
            Fukuda,S., Furuno,M., Hanagaki,T., Hara,A., Hashizume,W.,
            Hayashida,K., Hayatsu,N., Hiramoto,K., Hiraoka,T., Hirozane,T.,
            Hori,F., Imotani,K., Ishii,Y., Itoh,M., Kagawa,I., Kasukawa,T.,
            Katoh,H., Kawai,J., Kojima,Y., Kondo,S., Konno,H., Kouda,M.,
            Koya,S., Kurihara,C., Matsuyama,T., Miyazaki,A., Murata,M.,
            Nakamura,M., Nishi,K., Nomura,K., Numazaki,R., Ohno,M., Ohsato,N.,
            Okazaki,Y., Saito,R., Saitoh,H., Sakai,C., Sakai,K., Sakazume,N.,
            Sano,H., Sasaki,D., Shibata,K., Shinagawa,A., Shiraki,T.,
            Sogabe,Y., Tagami,M., Tagawa,A., Takahashi,F., Takaku-Akahira,S.,
            Takeda,Y., Tanaka,T., Tomaru,A., Toya,T., Yasunishi,A.,
            Muramatsu,M. and Hayashizaki,Y.
  TITLE     Direct Submission
  JOURNAL   Submitted (16-JUL-2001) Yoshihide Hayashizaki, The Institute of
            Physical and Chemical Research (RIKEN), Laboratory for Genome
            Exploration Research Group, RIKEN Genomic Sciences Center (GSC),
            RIKEN Yokohama Institute; 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama,
            Kanagawa 230-0045, Japan (E-mail:genome-res@gsc.riken.go.jp,
            URL:http://genome.gsc.riken.go.jp/, Tel:81-45-503-9222,
            Fax:81-45-503-9216)
COMMENT     cDNA library was prepared and sequenced in Mouse Genome
            Encyclopedia Project of Genome Exploration Research Group in Riken
            Genomic Sciences Center and Genome Science Laboratory in RIKEN.
            Division of Experimental Animal Research in Riken contributed to
            prepare mouse tissues.
            Please visit our web site for further details.
            URL:http://genome.gsc.riken.go.jp/
            URL:http://fantom.gsc.riken.go.jp/.
FEATURES             Location/Qualifiers
     source          1..2245
                     /organism="Mus musculus"
                     /mol_type="mRNA"
                     /strain="C57BL/6J"
                     /db_xref="FANTOM_DB:5330416K15"
                     /db_xref="taxon:10090"
                     /clone="5330416K15"
                     /sex="male"
                     /tissue_type="pituitary gland"
                     /clone_lib="RIKEN full-length enriched mouse cDNA library"
                     /dev_stage="adult"
     CDS             300..1313
                     /note="unnamed protein product; MO25-LIKE PROTEIN homolog
                     [Homo sapiens] (SWISSPROT|Q9H9S4, evidence: FASTY,
                     98.2%ID, 100%length, match=1002)
                     putative"
                     /codon_start=1
                     /protein_id="BAC26978.1"
                     /db_xref="GI:26326469"
                     /translation="MKKMPLFSKSHKNPAEIVKILKDNLAILEKQDKKTDKASEEVSK
                     SLQAMKEILCGTNDKEPPTEAVAQLAQELYSSGLLVTLIADLQLIDFEGKKDVTQIFN
                     NILRRQIGTRCPTVEYISSHPHILFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIIL
                     FSNQFRDFFKYVELSTFDIASDAFATFKDLLTRHKVLVADFLEQNYDTIFEDYEKLLQ
                     SENYVTKRQSLKLLGELILDRHNFTIMTKYISKPENLKLMMNLLRDKSPNIQFEAFHV
                     FKVFVASPHKTQPIVEILLKNQPKLIEFLSSFQKERTDDEQFADEKNYLIKQIRDLKK
                     AAP"
BASE COUNT  667 a 480 c 517 g 581 t
ORIGIN



Query Match 84.9%; Score 860.4; DB 11; Length 2245; 

Best Local Similarity 90.5%; Pred. No. 6.2e-176; 

Matches 918; Conservative 0; Mismatches 96; Indels 0; Gaps 0; 
Qy    1 ATGAAAAAAATGCCTTTGTTTAGTAAATCACACAAAAATCCAGCAGAAATTGTGAAAATC 60
Db  300 ATGAAAAAAATGCCTTTGTTTAGTAAATCACACAAAAATC 359

Qy   61 CTGAAAGACAATTTGGCCATTTTGGAAAAGCAAG 120
Db  360 CTGAAAGACAACCTGGCCATTTTGGAAAAGCAAGACAAAAAGACAGACAAGGCTTCAGAA 419

Qy  121 GAAGTGTCTAAATCACTGCAAGCAATGAAAGAAATTCTGTGTGGTACAAACGAGAAAGAA 180
        || ||||| ||||| |||||||||||||| |||||||||||||| || ||||| || ||
Db  420 GAGGTGTCAAAATCTCTGCAAGCAATGAAGGAAATTCTGTGTGGAACGAACGACAAGGAG 479

Qy  181 CCCCCAACAGAAGCAGTGGCTCAGCTAGCACAAGAACTCTACAGCAGTGGCCTGCTAGTG 240
        ||||| |||||||||||||||||||| || || || ||||||||||| ||  |||| |||
Db  480 CCCCCTACAGAAGCAGTGGCTCAGCTGGCGCAGGAGCTCTACAGCAGCGGGTTGCTGGTG 539

Qy  241 ACACTGATAGCTGACCTGCAGCTGATAGACTTTGAGGGAAAAAAAGATGTGACCCAGATA 300
        ||||| ||||||||||||||||| ||||||||||||||||||||||||||||||||||||
Db  540 ACACTCATAGCTGACCTGCAGCTCATAGACTTTGAGGGAAAAAAAGATGTGACCCAGATA 599

Qy  301 TTTAACAACATCTTGAGAAGACAGATAGGCACTCGGAGTCCTACTGTGGAGTATATTAGT 360
        || ||||||||| ||||||||||||| || || ||| |||||||||| ||||| || |||
Db  600 TTCAACAACATCCTGAGAAGACAGATTGGTACACGGTGTCCTACTGTCGAGTACATCAGT 659

Qy  361 GCTCATCCTCATATCCTGTTTATGCTCCTCAAAGGATATGAAGCCCCACAGATTGCCTTA 420
         |||||||||| |||||||||||||| |||||||| ||||||||||||||||||||||||
Db  660 TCTCATCCTCACATCCTGTTTATGCTTCTCAAAGGCTATGAAGCCCCACAGATTGCCTTA 719

Qy  421 CGTTGTGGGATTATGCTGAGAGAATGTATTCGACATGAACCACTTGCCAAAATCATCCTC 480
        || |||||||||||||| ||||| |||||||||||||| ||||||||||||||||||||
Db  720 CGCTGTGGGATTATGCTAAGAGAGTGTATTCGACATGAGCCACTTGCCAAAATCATCCTA 779

Qy  481 TTTTCTAATCAATTCAGAGATTTCTTTAAGTACGTGGAGTTGTCAACATTTGATATTGCT 540
        ||||||||||| |||||||||||||| ||||| || ||| |||| || |||||||| |||
Db  780 TTTTCTAATCAGTTCAGAGATTTCTTCAAGTATGTTGAGCTGTCCACCTTTGATATCGCT 839

Qy  541 TCAGATGCCTTTGCTACTTTCAAGGATTTACTAACCAGACATAAAGTGTTGGTAGCAGAC 600
        ||||||||||| |||||||| ||||||||  |||||||||||||||| ||||||||||||
Db  840 TCAGATGCCTTCGCTACTTTTAAGGATTTGTTAACCAGACATAAAGTATTGGTAGCAGAC 899

Qy  601 TTCTTAGAACAAAATTACGACACTATTTTTGAAGACTATGAGAAATTGCTTCAGTCTGAG 660
        ||||||||||||||||| ||||||||||||||||||||||||||| |||| || ||||||
Db  900 TTCTTAGAACAAAATTATGACACTATTTTTGAAGACTATGAGAAACTGCTGCAATCTGAG 959

Qy  661 AATTATGTTACTAAGAGACAGTCTTTAAAGCTGCTAGGGGAGCTGATCCTGGACCGTCAC 720
        || ||||| || |||||||| ||||||||| ||||||| ||||||||||||||||| |||
Db  960 AACTATGTGACAAAGAGACAATCTTTAAAGTTGCTAGGTGAGCTGATCCTGGACCGCCAC 1019

Qy  721 AACTTTGCCATCATGACAAAGTATATCAGCAAGCCGGAGAACCTGAAACTCATGATGAAC 780
        || ||  |||| ||||| ||||||||||||||||| |||||||||||||| |||||||||
Db 1020 AATTTCACCATTATGACCAAGTATATCAGCAAGCCAGAGAACCTGAAACTGATGATGAAC 1079

Qy  781 CTCCTTCGGGATAAAAGTCCCAACATCCAGTTTGAAGCCTTTCATGTTTTTAAGGTGTTT 840
        || ||||| || ||||||||||||||||| || |||||||| ||||| ||||||||||||
Db 1080 CTGCTTCGAGACAAAAGTCCCAACATCCAATTCGAAGCCTTCCATGTCTTTAAGGTGTTT 1139

Qy  841 GTGGCCAGTCCTCACAAAACACAGCCTATTGTGGAGATCCTGTTAAAAAATCAGCCCAAA 900
        |||||||| || |||||||| |||||||| |||||||| |||||||||||||||||||||
Db 1140 GTGGCCAGCCCCCACAAAACGCAGCCTATCGTGGAGATTCTGTTAAAAAATCAGCCCAAA 1199

Qy  901 CTCATTGAGTTTCTGAGCAGCTTCCAAAAAGAAAGGACGGATGATGAGCAGTTCGCTGAC 960
        ||||||||||||||||||||||| || ||||||||||| || || |||||||| ||||||
Db 1200 CTCATTGAGTTTCTGAGCAGCTTTCAGAAAGAAAGGACAGACGACGAGCAGTTTGCTGAC 1259

Qy  961 GAGAAGAACTACTTGATTAAACAGATCCGAGACTTGAAGAAAACGGCCCCTTGA 1014
        |||||||||||| ||||||||||||| ||||||||||||||| | ||||| |||
Db 1260 GAGAAGAACTACCTGATTAAACAGATTCGAGACTTGAAGAAAGCAGCCCCGTGA 1313
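
The summary percentages printed for each result can be cross-checked against the counts on the same lines. The sketch below is a minimal illustration and rests on two assumptions that the report does not state explicitly: that "Query Match" is the alignment score divided by the perfect score (taken here as 1014, the query length, under identity scoring), and that "Best Local Similarity" is matches divided by the total number of aligned columns. The function names are hypothetical.

```python
# Assumed relationships (inferred from the printed numbers, not documented here):
#   Query Match           = score / perfect_score
#   Best Local Similarity = matches / (matches + mismatches + indels)

PERFECT_SCORE = 1014  # assumption: query length under identity scoring


def query_match_pct(score: float, perfect_score: float = PERFECT_SCORE) -> float:
    """Score as a percentage of the perfect score, rounded to one decimal."""
    return round(100.0 * score / perfect_score, 1)


def local_similarity_pct(matches: int, mismatches: int, indels: int = 0) -> float:
    """Matching columns as a percentage of all aligned columns."""
    aligned = matches + mismatches + indels
    return round(100.0 * matches / aligned, 1)


if __name__ == "__main__":
    # Values printed for RESULT 2 (AK030474).
    print(query_match_pct(860.4))         # compare with "Query Match 84.9%"
    print(local_similarity_pct(918, 96))  # compare with "Best Local Similarity 90.5%"
```

Under these assumptions the same two functions also reproduce the figures printed for the result with 915 matches, 97 mismatches and 2 indels (score 844.8).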



RESULT 3 
AK053642 
LOCUS       AK053642 3039 bp mRNA linear HTC 05-DEC-2002
DEFINITION  Mus musculus 0 day neonate eyeball cDNA, RIKEN full-length enriched
            library, clone:E130116O21 product:MO25-LIKE PROTEIN homolog [Homo
            sapiens], full insert sequence.
ACCESSION   AK053642
VERSION     AK053642.1 GI:26343600
KEYWORDS    HTC; CAP trapper.
SOURCE      Mus musculus (house mouse)
  ORGANISM  Mus musculus
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
REFERENCE   1
  AUTHORS   Carninci,P. and Hayashizaki,Y.
  TITLE     High-efficiency full-length cDNA cloning
  JOURNAL   Meth. Enzymol. 303, 19-44 (1999)
  MEDLINE   99279253
   PUBMED   10349636
REFERENCE   2
  AUTHORS   Carninci,P., Shibata,Y., Hayatsu,N., Sugahara,Y., Shibata,K.,
            Itoh,M., Konno,H., Okazaki,Y., Muramatsu,M. and Hayashizaki,Y.
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/organism="Mus musculus" 
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putative" 

/codon_start=1 
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Query Match 84.9%; Score 860.4; DB 11; Length 3039; 

Best Local Similarity 90.5%; Pred. No. 6.6e-176; 

Matches 918; Conservative 0; Mismatches 96; Indels 0; Gaps 0; 
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RESULT 4 
AK076758 

LOCUS       AK076758 1377 bp mRNA linear HTC 07-DEC-2002
DEFINITION  Mus musculus adult male testis cDNA, RIKEN full-length enriched
            library, clone:4930433N18 product:MO25-LIKE PROTEIN homolog [Homo
            sapiens], full insert sequence.
ACCESSION   AK076758
VERSION     AK076758.1 GI:26345637
KEYWORDS    HTC; CAP trapper.
SOURCE      Mus musculus (house mouse)
  ORGANISM  Mus musculus
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
REFERENCE   1
  AUTHORS   Carninci,P. and Hayashizaki,Y.
  TITLE     High-efficiency full-length cDNA cloning
  JOURNAL   Meth. Enzymol. 303, 19-44 (1999)
  MEDLINE   99279253
   PUBMED   10349636
REFERENCE   2
  AUTHORS   Carninci,P., Shibata,Y., Hayatsu,N., Sugahara,Y., Shibata,K.,
            Itoh,M., Konno,H., Okazaki,Y., Muramatsu,M. and Hayashizaki,Y.
  TITLE     Normalization and subtraction of cap-trapper-selected cDNAs to
            prepare full-length cDNA libraries for rapid discovery of new genes
  JOURNAL   Genome Res. 10 (10), 1617-1630 (2000)
  MEDLINE   20499374
   PUBMED   11042159
REFERENCE   3
  AUTHORS   Shibata,K., Itoh,M., Aizawa,K., Nagaoka,S., Sasaki,N., Carninci,P.,
            Konno,H., Akiyama,J., Nishi,K., Kitsunai,T., Tashiro,H., Itoh,M.,
            Sumi,N., Ishii,Y., Nakamura,S., Hazama,M., Nishine,T., Harada,A.,
            Yamamoto,R., Matsumoto,H., Sakaguchi,S., Ikegami,T., Kashiwagi,K.,
            Fujiwake,S., Inoue,K., Togawa,Y., Izawa,M., Ohara,E., Watahiki,M.,
            Yoneda,Y., Ishikawa,T., Ozawa,K., Tanaka,T., Matsuura,S., Kawai,J.,
            Okazaki,Y., Muramatsu,M., Inoue,Y., Kira,A. and Hayashizaki,Y.
  TITLE     RIKEN integrated sequence analysis (RISA) system--384-format
            sequencing pipeline with 384 multicapillary sequencer
  JOURNAL   Genome Res. 10 (11), 1757-1771 (2000)
  MEDLINE   20530913
   PUBMED   11076861
REFERENCE   4
  AUTHORS   Kawai,J., Shinagawa,A., Shibata,K., Yoshino,M., Itoh,M., Ishii,Y.,
            Arakawa,T., Hara,A., Fukunishi,Y., Konno,H., Adachi,J., Fukuda,S.,
            Aizawa,K., Izawa,M., Nishi,K., Kiyosawa,H., Kondo,S., Yamanaka,I.,
            Saito,T., Okazaki,Y., Gojobori,T., Bono,H., Kasukawa,T., Saito,R.,
            Kadota,K., Matsuda,H., Ashburner,M., Batalov,S., Casavant,T.,
            Fleischmann,W., Gaasterland,T., Gissi,C., King,B., Kochiwa,H.,
            Kuehl,P., Lewis,S., Matsuo,Y., Nikaido,I., Pesole,G.,
            Quackenbush,J., Schriml,L.M., Staubli,F., Suzuki,R., Tomita,M.,
            Wagner,L., Washio,T., Sakai,K., Okido,T., Furuno,M., Aono,H.,
            Baldarelli,R., Barsh,G., Blake,J., Boffelli,D., Bojunga,N.,
            Carninci,P., de Bonaldo,M.F., Brownstein,M.J., Bult,C.,
            Fletcher,C., Fujita,M., Gariboldi,M., Gustincich,S., Hill,D.,
            Hofmann,M., Hume,D.A., Kamiya,M., Lee,N.H., Lyons,P.,
            Marchionni,L., Mashima,J., Mazzarelli,J., Mombaerts,P., Nordone,P.,
            Ring,B., Ringwald,M., Rodriguez,I., Sakamoto,N., Sasaki,H.,
            Sato,K., Schonbach,C., Seya,T., Shibata,Y., Storch,K.F., Suzuki,H.,
            Toyo-oka,K., Wang,K.H., Weitz,C., Whittaker,C., Wilming,L.,
            Wynshaw-Boris,A., Yoshida,K., Hasegawa,Y., Kawaji,H., Kohtsuki,S.
            and Hayashizaki,Y.
  TITLE     Functional annotation of a full-length mouse cDNA collection
  JOURNAL   Nature 409 (6821), 685-690 (2001)
  MEDLINE   21085660
   PUBMED   11217851

REFERENCE   5
  AUTHORS   The FANTOM Consortium and the RIKEN Genome Exploration Research
            Group Phase I & II Team.
  TITLE     Analysis of the mouse transcriptome based on functional annotation
            of 60,770 full-length cDNAs
  JOURNAL   Nature 420, 563-573 (2002)
REFERENCE   6 (bases 1 to 1377)
  AUTHORS   Adachi,J., Aizawa,K., Akahira,S., Akimura,T., Aono,H., Arai,A.,
            Arakawa,T., Bono,H., Carninci,P., Fukuda,S., Fukunishi,Y.,
            Furuno,M., Hanagaki,T., Hara,A., Hayatsu,N., Hiramoto,K.,
            Hiraoka,T., Hori,F., Imotani,K., Ishii,Y., Itoh,M., Izawa,M.,
            Kasukawa,T., Kato,H., Kawai,J., Kojima,Y., Konno,H., Kouda,M.,
            Koya,S., Kurihara,C., Matsuyama,T., Miyazaki,A., Nishi,K.,
            Nomura,K., Numazaki,R., Ohno,M., Okazaki,Y., Okido,T., Owa,C.,
            Saito,H., Saito,R., Sakai,C., Sakai,K., Sano,H., Sasaki,D.,
            Shibata,K., Shibata,Y., Shinagawa,A., Shiraki,T., Sogabe,Y.,
            Suzuki,H., Tagami,M., Tagawa,A., Takahashi,F., Tanaka,T.,
            Tejima,Y., Toya,T., Yamamura,T., Yamanaka,I., Yasunishi,A.,
            Yoshida,K., Yoshino,M., Muramatsu,M. and Hayashizaki,Y.
  TITLE     Direct Submission
  JOURNAL   Submitted (16-APR-2002) Yoshihide Hayashizaki, The Institute of
            Physical and Chemical Research (RIKEN), Laboratory for Genome
            Exploration Research Group, RIKEN Genomic Sciences Center (GSC),
            RIKEN Yokohama Institute; 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama,
            Kanagawa 230-0045, Japan (E-mail:genome-res@gsc.riken.go.jp,
            URL:http://genome.gsc.riken.go.jp/, Tel:81-45-503-9222,
            Fax:81-45-503-9216)
COMMENT     cDNA library was prepared and sequenced in Mouse Genome
            Encyclopedia Project of Genome Exploration Research Group in Riken
            Genomic Sciences Center and Genome Science Laboratory in RIKEN.
            Division of Experimental Animal Research in Riken contributed to
            prepare mouse tissues.
            Please visit our web site for further details.
            URL:http://genome.gsc.riken.go.jp/
            URL:http://fantom.gsc.riken.go.jp/.
FEATURES             Location/Qualifiers
     source          1..1377
                     /organism="Mus musculus"
                     /mol_type="mRNA"
                     /strain="C57BL/6J"
                     /db_xref="FANTOM_DB:4930433N18"
                     /db_xref="MGI:1894867"
                     /db_xref="taxon:10090"
                     /clone="4930433N18"
                     /sex="male"
                     /tissue_type="testis"
                     /clone_lib="RIKEN full-length enriched mouse cDNA library"
                     /dev_stage="adult"
     CDS             287..1300
                     /note="unnamed protein product; MO25-LIKE PROTEIN homolog
                     [Homo sapiens] (SWISSPROT|Q9H9S4, evidence: FASTY,
                     98.2%ID, 100%length, match=1002)
                     putative"
                     /codon_start=1
                     /protein_id="BAC36470.1"
                     /db_xref="GI:26345638"
                     /db_xref="MGI:1914081"
                     /translation="MKKMPLFSKSHKNPAEIVKILKDNLAILEKQDKKTDKASEEVSK
                     SLQAMKEILCGTNDKEPPTEAVAQLAQELYSSGLLVTLIADLQLIDFEGKKDVTQIFN
                     NILRRQIGTRCPTVEYISSHPHILFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIIL
                     FSNQFRDFFKYVELSTFDIASDAFATFKDLLTRHKVLVADFLEQNYDTIFEDYEKLLQ
                     SENYVTKRQSLKLLGELILDRHNFTIMTKYISKPENLKLMMNLLRDKSPNIQFEAFHV
                     FKVFVASPHKTQPIVEILLKNQPKLIEFLSSFQKERTDDEQFADEKNYLIKQIRDLKK
                     AAP"
BASE COUNT  430 a 294 c 306 g 347 t
ORIGIN

Query Match 84.7%; Score 858.8; DB 11; Length 1377; 

Best Local Similarity 90.4%; Pred. No. 1.3e-175; 

Matches 917; Conservative 0; Mismatches 97; Indels 0; Gaps 0; 
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181 


Db  467

Qy  241
Db  527

Qy  301
Db  587

Qy  361
Db  647

Qy  421
Db  707

Qy  481
Db  767

Qy  541
Db  827

Qy  601
Db  887

Qy  661 AATTATGTTACTAAGAGACAGTCTTTAAAGCTGCTAGGGGAGCTGATCCTGGACCGTCAC 720
        || ||||| || |||||||| ||||||||| ||||||| ||||||||||||||||| |||
Db  947 AACTATGTGACAAAGAGACAATCTTTAAAGTTGCTAGGTGAGCTGATCCTGGACCGCCAC 1006

Qy  721 AACTTTGCCATCATGACAAAGTATATCAGCAAGCCGGAGAACCTGAAACTCATGATGAAC 780
        || ||  |||| ||||| ||||||||||||||||| |||||||||||||| |||||||||
Db 1007 AATTTCACCATTATGACCAAGTATATCAGCAAGCCAGAGAACCTGAAACTGATGATGAAC 1066

Qy  781 CTCCTTCGGGATAAAAGTCCCAACATCCAGTTTGAAGCCTTTCATGTTTTTAAGGTGTTT 840
        || ||||| || ||||||||||||||||| || |||||||| ||||| ||||||||||||
Db 1067 CTGCTTCGAGACAAAAGTCCCAACATCCAATTCGAAGCCTTCCATGTCTTTAAGGTGTTT 1126

Qy  841 GTGGCCAGTCCTCACAAAACACAGCCTATTGTGGAGATCCTGTTAAAAAATCAGCCCAAA 900
        |||||||| || |||||||| |||||||| |||||||| |||||||||||||||||||||
Db 1127 GTGGCCAGCCCCCACAAAACGCAGCCTATCGTGGAGATTCTGTTAAAAAATCAGCCCAAA 1186

Qy  901 CTCATTGAGTTTCTGAGCAGCTTCCAAAAAGAAAGGACGGATGATGAGCAGTTCGCTGAC 960
        |||||||||||| |||||||||| || ||||||||||| || || |||||||| ||||||
Db 1187 CTCATTGAGTTTTTGAGCAGCTTTCAGAAAGAAAGGACAGACGACGAGCAGTTTGCTGAC 1246

Qy  961 GAGAAGAACTACTTGATTAAACAGATCCGAGACTTGAAGAAAACGGCCCCTTGA 1014
        |||||||||||| ||||||||||||| ||||||||||||||| | ||||| |||
Db 1247 GAGAAGAACTACCTGATTAAACAGATTCGAGACTTGAAGAAAGCAGCCCCGTGA 1300
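
The feature coordinates in these records can be sanity-checked against the alignment ranges. The sketch below is a minimal illustration with hypothetical helper names: it computes the length of a simple `start..end` location and compares it with the length implied by the annotated protein, assuming a single uninterrupted CDS that ends in one stop codon. The protein length of 337 residues is a count of the `/translation` qualifier printed for the CDS at 287..1300.

```python
def span_length(location: str) -> int:
    """Length in bases of a simple 'start..end' feature location (1-based, inclusive)."""
    start, end = location.split("..")
    return int(end) - int(start) + 1


def expected_cds_length(protein_length: int, has_stop: bool = True) -> int:
    """Bases needed to encode a protein, optionally including one stop codon."""
    codons = protein_length + (1 if has_stop else 0)
    return 3 * codons


if __name__ == "__main__":
    cds = span_length("287..1300")      # CDS location from the feature table
    implied = expected_cds_length(337)  # residues counted in /translation
    print(cds, implied, cds == implied)
```

When the two values agree, the CDS span has the same length as the 1014-base query, which is consistent with the alignment covering Db positions 287 to 1300 against Qy positions 1 to 1014.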



RESULT 5 
AK013205 
LOCUS       AK013205 1449 bp mRNA linear HTC 05-DEC-2002
DEFINITION  Mus musculus 10, 11 days embryo whole body cDNA, RIKEN full-length
            enriched library, clone:2810430N08 product:MO25-LIKE PROTEIN
            homolog [Homo sapiens], full insert sequence.
ACCESSION   AK013205
VERSION     AK013205.1 GI:12850419
KEYWORDS    HTC; CAP trapper.
SOURCE      Mus musculus (house mouse)
  ORGANISM  Mus musculus
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
REFERENCE   1
  AUTHORS   Carninci,P. and Hayashizaki,Y.
  TITLE     High-efficiency full-length cDNA cloning
  JOURNAL   Meth. Enzymol. 303, 19-44 (1999)
  MEDLINE   99279253
   PUBMED   10349636
REFERENCE   2
  AUTHORS   Carninci,P., Shibata,Y., Hayatsu,N., Sugahara,Y., Shibata,K.,
            Itoh,M., Konno,H., Okazaki,Y., Muramatsu,M. and Hayashizaki,Y.
  TITLE     Normalization and subtraction of cap-trapper-selected cDNAs to
            prepare full-length cDNA libraries for rapid discovery of new genes
  JOURNAL   Genome Res. 10 (10), 1617-1630 (2000)
  MEDLINE   20499374
   PUBMED   11042159
REFERENCE   3
  AUTHORS   Shibata,K., Itoh,M., Aizawa,K., Nagaoka,S., Sasaki,N., Carninci,P.,
            Konno,H., Akiyama,J., Nishi,K., Kitsunai,T., Tashiro,H., Itoh,M.,
            Sumi,N., Ishii,Y., Nakamura,S., Hazama,M., Nishine,T., Harada,A.,
            Yamamoto,R., Matsumoto,H., Sakaguchi,S., Ikegami,T., Kashiwagi,K.,
            Fujiwake,S., Inoue,K., Togawa,Y., Izawa,M., Ohara,E., Watahiki,M.,
            Yoneda,Y., Ishikawa,T., Ozawa,K., Tanaka,T., Matsuura,S., Kawai,J.,
            Okazaki,Y., Muramatsu,M., Inoue,Y., Kira,A. and Hayashizaki,Y.
  TITLE     RIKEN integrated sequence analysis (RISA) system--384-format
            sequencing pipeline with 384 multicapillary sequencer
  JOURNAL   Genome Res. 10 (11), 1757-1771 (2000)
  MEDLINE   20530913
   PUBMED   11076861
REFERENCE   4
  AUTHORS   Kawai,J., Shinagawa,A., Shibata,K., Yoshino,M., Itoh,M., Ishii,Y.,
            Arakawa,T., Hara,A., Fukunishi,Y., Konno,H., Adachi,J., Fukuda,S.,
            Aizawa,K., Izawa,M., Nishi,K., Kiyosawa,H., Kondo,S., Yamanaka,I.,
            Saito,T., Okazaki,Y., Gojobori,T., Bono,H., Kasukawa,T., Saito,R.,
            Kadota,K., Matsuda,H., Ashburner,M., Batalov,S., Casavant,T.,
            Fleischmann,W., Gaasterland,T., Gissi,C., King,B., Kochiwa,H.,
            Kuehl,P., Lewis,S., Matsuo,Y., Nikaido,I., Pesole,G.,
            Quackenbush,J., Schriml,L.M., Staubli,F., Suzuki,R., Tomita,M.,
            Wagner,L., Washio,T., Sakai,K., Okido,T., Furuno,M., Aono,H.,
            Baldarelli,R., Barsh,G., Blake,J., Boffelli,D., Bojunga,N.,
            Carninci,P., de Bonaldo,M.F., Brownstein,M.J., Bult,C.,
            Fletcher,C., Fujita,M., Gariboldi,M., Gustincich,S., Hill,D.,
            Hofmann,M., Hume,D.A., Kamiya,M., Lee,N.H., Lyons,P.,
            Marchionni,L., Mashima,J., Mazzarelli,J., Mombaerts,P., Nordone,P.,
            Ring,B., Ringwald,M., Rodriguez,I., Sakamoto,N., Sasaki,H.,
            Sato,K., Schonbach,C., Seya,T., Shibata,Y., Storch,K.F., Suzuki,H.,
            Toyo-oka,K., Wang,K.H., Weitz,C., Whittaker,C., Wilming,L.,
            Wynshaw-Boris,A., Yoshida,K., Hasegawa,Y., Kawaji,H., Kohtsuki,S.
            and Hayashizaki,Y.
  TITLE     Functional annotation of a full-length mouse cDNA collection
  JOURNAL   Nature 409 (6821), 685-690 (2001)
  MEDLINE   21085660
   PUBMED   11217851
REFERENCE   5
  AUTHORS   The FANTOM Consortium and the RIKEN Genome Exploration Research
            Group Phase I & II Team.
  TITLE     Analysis of the mouse transcriptome based on functional annotation
            of 60,770 full-length cDNAs
  JOURNAL   Nature 420, 563-573 (2002)
REFERENCE   6 (bases 1 to 1449)
  AUTHORS   Adachi,J., Aizawa,K., Akahira,S., Akimura,T., Arai,A., Aono,H.,
            Arakawa,T., Bono,H., Carninci,P., Fukuda,S., Fukunishi,Y.,
            Furuno,M., Hanagaki,T., Hara,A., Hayatsu,N., Hiramoto,K.,
            Hiraoka,T., Hori,F., Imotani,K., Ishii,Y., Itoh,M., Izawa,M.,
            Kasukawa,T., Kato,H., Kawai,J., Kojima,Y., Konno,H., Kouda,M.,
            Koya,S., Kurihara,C., Matsuyama,T., Miyazaki,A., Nishi,K.,
            Nomura,K., Numazaki,R., Ohno,M., Okazaki,Y., Okido,T., Owa,C.,
            Saito,H., Saito,R., Sakai,C., Sakai,K., Sano,H., Sasaki,D.,
            Shibata,K., Shibata,Y., Shinagawa,A., Shiraki,T., Sogabe,Y.,
            Suzuki,H., Tagami,M., Tagawa,A., Takahashi,F., Tanaka,T.,
            Tejima,Y., Toya,T., Yamamura,T., Yasunishi,A., Yoshida,K.,
            Yoshino,M., Muramatsu,M. and Hayashizaki,Y.
  TITLE     Direct Submission
  JOURNAL   Submitted (10-JUL-2000) Yoshihide Hayashizaki, The Institute of
            Physical and Chemical Research (RIKEN), Laboratory for Genome
            Exploration Research Group, RIKEN Genomic Sciences Center (GSC),
            RIKEN Yokohama Institute; 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama,
            Kanagawa 230-0045, Japan (E-mail:genome-res@gsc.riken.go.jp,
            URL:http://genome.gsc.riken.go.jp/, Tel:81-45-503-9222,
            Fax:81-45-503-9216)
COMMENT     Please visit our web site (http://genome.gsc.riken.go.jp/) for
            further details.
            cDNA library was prepared and sequenced in Mouse Genome
            Encyclopedia Project of Genome Exploration Research Group in Riken
            Genomic Sciences Center and Genome Science Laboratory in RIKEN.
            Division of Experimental Animal Research in Riken contributed to
            prepare mouse tissues. First strand cDNA was primed with a primer

            prepared by using trehalose thermo-activated reverse transcriptase
            and subsequently enriched for full-length by cap-trapper. cDNA went
            through one round of normalization to Rot = 7.5 and subtraction to
            Rot = 37.5. Second strand cDNA was prepared with the primer adapter
            of sequence [5'
            GAGAGAGAGATTCTCGAGTTAATTAAATTAATCCCCCCCCCCCCC 3']. cDNA was cleaved
            with XhoI and SstI. Cloning sites, 5' end: XhoI; 3' end: SstI.
            Host: SOLR.
FEATURES             Location/Qualifiers
     source          1..1449
                     /organism="Mus musculus"
                     /mol_type="mRNA"
                     /strain="C57BL/6J"
                     /db_xref="FANTOM_DB:2810430N08"
                     /db_xref="MGI:1893512"
                     /db_xref="taxon:10090"
                     /clone="2810430N08"
                     /tissue_type="whole body"
                     /clone_lib="RIKEN full-length enriched mouse cDNA library"
                     /dev_stage="10, 11 days embryo"
     misc_feature    281..1292
                     /note="MO25-LIKE PROTEIN homolog [Homo sapiens]
                     (SWISSPROT|Q9H9S4, evidence: FASTY, 98.2%ID, 100%length,
                     match=1002)
                     putative"
                     /db_xref="MGI:1914081"
BASE COUNT  453 a 304 c 325 g 367 t
ORIGIN



Query Match 83.3%; Score 844.8; DB 11; Length 1449; 

Best Local Similarity 90.2%; Pred. No. 1.4e-172; 

Matches 915; Conservative 0; Mismatches 97; Indels 2; Gaps 1; 



Qy 


1 


ATGAAAAAAATGCCTTTGTTTAGTAAATCACACAA 

llllllllllllll IIIIIIIIIIIIMIIIIIIIIIIIIIIIMIIIII Mill 

ATGAAAAAAATGCCCTTGTTTAGTAAATCACACAAAAATC 


60 


Db 


281 


340 


Qy 


61 


CTGAAAGACAATTTGGCCATTTTGGAAAAGCAAGACAAAAAG^ 

IMIIIMIII 1 1 II 1 1 1 III IMII 1 1 1 III Ml M Ml II 1 1 1 II 1 II 1 II INI 

CTGAAAGACAAC CTGG CCATTTTGGAAAAG CAAGAQyU^AAGACAGACAAGG CTTCAGAA 


120 


Db 


341 


400 


Qy 


121 


GAAGTGT CTAAAT CACTG CAAG CAATGAAAGAAATTCTGTGTGGTACAAACGAGAAAGAA 

II Mill IMII llllllllllllll MIMIMMMM II IMII II II 

GAGGTGTCAAAATCTCTGCAAGCAATGAAGGAAATTCTGTGTGGAACGAACGACA^ 


180 


Db 


401 


460 



Qy 



181 CCCCCAACAGAAGCAGTGGCTCAGCTAGCACAAGAACTCTACAGCAGTGGCCTGCTAGTG 24 0 



Db 


461 


Qy 


241 


Db 


521 


Qy 


301 


Db 


581 


Qy 


361 


Db 


641 


Qy 


421 


Db 


701 


Qy 


481 


Db 


761 


Qy 


541 


Db 


819 


Qy 


601 


Db 


879 


Qy 


661 


Db 


939 


Qy 


721 


Db 


999 


Qy 


781 


Db 


1059 


Qy 


841 


Db 


1119 


Qy 


901 


Db 


1179 


Qy 


961 


Db 


1239 



Mill II 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 M II II || IIIIMIMI M MM Ml 

CCC CCTACAGAAGCAGTGGCTCAGCTGGCGCAGGAG CT CTACAG CAG CGGGTTGCTGGTG 
ACACTGATAG CTGAC CTG CAG CTGATAGACTTTGAGGGAAAAAAAGATGTGACCCAGATA 

Mill MIMIIIIMIMIM 1 1 1 1 ! 1 1 1 1 1 1 1 1 M II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 

ACACTCATAGCTGACCTGCAGCTCATAGACTTTGAGGGAAAAAGAGATGTGACCCAGATA 
TTTAACAACATCTTGAGAAGACAGATAGG CACTCGGAGT CCTACTGTGGAGTATATTAGT 

M M 1 1 M 1 1 1 IIIIIIIIMIII II II III IIIIMIMI Mill II Ml 



GCTCATCCTCATATCCTGTTTATGCTCCTCAAAGGATATGAAGCCCCACAGATTGCCTTA 42 0 

IIIIMIMI MINIMUM: 111:11 1 1 I II 1 1 II 1 1 II 1 1 II 1 1 1 1 II II I 

TCTCATCCTCACATCCTGTTTATGCTTCTCAAAGGCTATGAAGCCCCACAGATTGCCTTA 700 



II II I III Mill Ml Mill IIIIMIMI I III II IN I MINI I Ml Ml I 

CGCTGTGGGATTATGCTAAGAGAGTGTATTCGACATGAGCCACTTGCCAAAATCATCCTA 76 0 
TTTTCTAATCAATTCAGAGATTTCTTTAAGTACGTGGAGTTGTCAACATTTGATATTGCT 54 0 

IIIIMIMI I I! II I MM II III MM II 1 1 1 1 1 i 1 1 III 

TTTTCTAATCAGTTCAGAGATTTCTTCAAGT - - GTTGAGCTGTCCACCTTTGATATCGCT 818 
TCAGATGCCTTTGCTACTTTCAAGGATTTACTAACCAGACATAAAGTGTTGGTAGCAGAC 600 

M II 1 1 II I II llllllll MIIIMI I M I M 1 1 M 1 1 ; 1 1 II I 

TCAGATGCCTTCGCTACTTTTAAGGATTTGTTAACCAGACATAAAGTATTGGTAGCAGAC 878 
TTCTTAGAACAAAATTACGACACTATTTTTGAAGACTATGAGAAATTGCTTCAGTCTGAG 660 

M M 1 1 II II 1 1 1 1 1 1 1 MMMMMIMMIMMMIMM MM II IMIM 

TTCTTAGAACAAAATTATGACACTATTTTTGAAGACTATGAGAAACTGCTGCAATCTGAG 938 
AATTATGTTACTAAGAGACAGTCTTTAAAGCTGCTAGGGGAGCTGATCCTGGACCGTCAC 72 0 

M Mill II llllllll IMIMIII IMIIII IMMMMIMIMM III 

AACTATGTGACAAAGAGACAATCTTTAAAGTTGCTAGGTGAGCTGATCCTGGACCGCCAC 998 
AACTTTGCCATCATGACAAAGTATATOVGCAAGCCGGAGAACCTGAAACTCATGATGAAC 78 0 

M II MM Mill 1 1 1 1 ! I ! I : I ! 1 1 1 1 1 1 1 1 1 1 1 Mill! 1 1 IMIMIII 

AATTTCACC^TTATGACCAAGTATATCAGCAAGCCAGAGAACCTGAAACTGATGATGAAC 1058 
CTCCTTCGGGATAAAAGTCCCAACATCCAGTTTGAAGCCTTTCATGTTTTTAAGGTGTTT 84 0 

M MM! II M 1 1 1 II 1 1 1 1 M M 1 1 II MIIIMI Mill MINIUM 

CTGCTTCGAGACAAAAGTCCCAACATCCAATTCGAAGCCTTCCATGTCTTTAAGGTGTTT 1118 
GTGGCCAGTCCTCACAAAACAC^GCCTATTGTGGAGATCCTGTTAAAAAATCAGCCCAAA 900 

MIIIMI II llllllll llllllll MIIIMI 1 1 1 1 1 1 1 1 1 II I II II II I M 

GTGGCCAGCCCCCACAAAACGCAGCCTATCGTGGAGATTCTGTTAAAAAATCAGCCCAAA 1178 
CTCATTGAGTTTCTGAGCAGCTTCCAAAAAGAAAGGACGGATGATGAGCAGTTCGCTGAC 960

I M M M II 1 1 II II I II I M 1 1 II 1 1 II 1 1 II I II II II llllllll IMIM 

CTCATTGAGTTTCTGAGCAGCTTTCAGAAAGAAAGGACAGACGACGAGCAGTTTGCTGAC 1238

GAGAAGAACTACTTGATTAAACAGATCCGAGACTTGAAGAAAACGGCCCCTTGA 1014 

M M I M II I I I II II I I I I I II I I MMMMIIIMM I Mill III 

GAGAAGAACTACCTGATTAAACAGATTCGAGACTTGAAGAAAGCAGCCCCGTGA 1292



RESULT 6 



AU125107
LOCUS       AU125107     822 bp    mRNA    linear   EST 01-AUG-2002
DEFINITION  AU125107 NT2RM4 Homo sapiens cDNA clone NT2RM4001047 5', mRNA
            sequence.
ACCESSION   AU125107
VERSION     AU125107.1  GI:10949823
KEYWORDS    EST.
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 822)
  AUTHORS   Ota,T., Wakamatsu,A., Ozawa,M., Ishii,S., Saito,K., Yamamoto,J.,
            Nakamura,Y., Nishikawa,T., Nagai,T., Suzuki,Y., Sugano,S. and
            Isogai,T.
  TITLE     HRI human cDNA project (Ota,T., Wakamatsu,A., Ozawa,M., Ishii,S.,
            Saito,K., Yamamoto,J., Nakamura,Y., Nishikawa,T., Nagai,T.,
            Suzuki,Y., Sugano,S., Isogai,T.)
  JOURNAL   Unpublished
COMMENT     Contact: Takao Isogai
            Genomics Laboratory
            Helix Research Institute
            1532-3 Yana, Kisarazu, Chiba 292-0812, Japan
            Tel: 81-438-52-3975
            Fax: 81-438-52-3986
            Email: genomics@hri.co.jp
            HRI human cDNA project; 5'- & 3'-end one pass sequencing: Helix
            Research Institute; cDNA library construction: Department of
            Virology, Institute of Medical Science, University of Tokyo, and
            Helix Research Institute.
FEATURES             Location/Qualifiers
     source          1..822
                     /organism="Homo sapiens"
                     /mol_type="mRNA"
                     /db_xref="taxon:9606"
                     /clone="NT2RM4001047"
                     /cell_type="teratocarcinoma"
                     /cell_line="NT2"
                     /clone_lib="NT2RM4"
                     /note="Vector: pME18SFL3; mRNA from uninduced NT2 neuronal
                     precursor cells"
BASE COUNT      268 a    164 c    171 g    216 t      3 others
ORIGIN



Query Match 76.0%; Score 770.6; DB 9; Length 822; 

Best Local Similarity 98.5%; Pred. No. 1.5e-156; 
Matches 798; Conservative 0; Mismatches 10; Indels 2; Gaps 2;



Qy 
Db 

Qy 
Db 



19 TTTAGTAAATCACA.CAAAAATCCAGCAGAAATTGTGA 78 

1 1 1 M 1 1 1 1 M 1 1! 1 1 1 ; 1 1 i li 1 1 1 1 1 1 1 1 ! 1 1 1 1 1 1 1 M 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 

1 TTTAGTAAATCACACAAAAATCCAGC^ 60 
79 ATTTTGGAAAAGCAAGACAAAAAGACAGACAAGGCTTCAGAAGAAGTGTCT 138 

I i M 1 1 1 i 1 1 1 1 1 II 1 1 1 1 ! 1 1 1 M 1 1 1 1 ! II 1 1 1 M 1 1 1 1 1 1 M 1 1 1 IN I i 1 1 1 II 1 1 

61 ATTTTGGAAAAGCAAGACAAAAAGACAGACAAGGCTT(^GAAGAAGTGTCTAAATCACTG 12 0 



Qy 13 9 CAAG CAATGAAAGAAATT CTGTGTGGTACAAACGAGAAAGAACCC CCAACAGAAGCAGTG 198 



Db 


121 


Qy 


199 


Db 


181 


Qy 


259 


Db 


241 


Qy 


319 


Db 


301 


Qy 


379 


Db 


361 


Qy 


439 


Db 


421 


Qy 


499 


Db 


481 


Qy 


559 


Db 


541 


Qy 


619 


Db 


601 


Qy 


679 


Db 


661 


Qy 


739 


Db 


721 


Qy 


799 


Db 


779 



II 1 1 1 M M 1 1 1 1 1 M 1 1 1 1 1 1 1 1 1 1 1 1 1! 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M I 

CAAGCAATGAAAGAAATTCTGTGTC 18 0 

GCTCAGCTAGCACAAGAACTCTACAGCAGTGGCCTGCTAGTGACACTGATAGCTGACCTG 258

1 1 ill 1 1 1 1 1 II Ml 1 1 1 1 II 1 1 1 1 1 1 II I II 1 1 il 1 1 II I II 1 1 1 1 1 1 II II I II I 

GCTCAGCTAGCACAAGAACTCTACAGCAGTGGCCTGCTGGTGACACTGATAGCTGACCTG 240



1 1 Ml 1 1 1 1 1 Mi II 1 1 1 1 II M 1 1 1 1 II 1 1 1 II I II II MINI II 1 1 1 III llllll 



AGACAGATAGGCACTCGGAGTCCTACTGTGGAGTATATTAGTGCTCATCCTCATATCCTG 378 

1 1 II II 1 1 1 Ml 1 1 1 M I < 1 1 II 1 1 1 III 1 1 1 M I II 1 1 1 1 IMI 1 1 1 1 1 II I MUM 

AGACAGATAGGCACTCGGAGTCCTACTGTGGAGTATATTAGTGCTCATCCTCATATCCTG 360
TTTATGCTCCTCAAAGGATATGAAGCCCCACAGATTGCCTTACGTTGTGGGATTATGCTG 438 

Ml I Ml MM MM II Ml II III MMM M M II I II I M II II II MMM II M I 

TTTATGCTCCTCAAAGGATATGAAGCCCCACAGATTGCCTTACGTTGTGGGATTATGCTG 420 
AGAGAATGTATTCGACATGAACCACTTGCCAAAATCATCCTCTTTTCTAATCAATTCAGA 498

1 1 1 1 1 II I II 1 1 1 1 1 1 III I! I II M 1 1 MM 1 1 1 M 1 1 1 1 1 1 1 1 1 1 Ml Ml II 1 1 1 1 

AGAGAATGTATTCGACATGAACCACTTGTG^ 4 80 

GATTTCTTTAAGTACGTGGAGTTGTCAACATTTGATATTGCTTCAGATGCCTTTGCTACT 558 

1 1 Ml 1 1 1 1 1 Ml I II Ml II 1 1 1 1 1 1 1 1 Ml 1 1 1 M 1 1 1 M 1 1 II I III III 1 1 1 1 1 

GATTTCTTTAAGTACGTGGAGTTGTCAACATTTGATATTGCTTCAGATGCCTTTGCTACT 540 
TTCAAGGATTTACTAACCAGACATAAAGTGTTGGTAGCAGACTTCTTAGAACAAAATTAC 618 

1 1 M I II II llllll M I II I M II I II I II II I MIMM II 1 1 II II I II I II II 1 1 

TTCAAGGATTTACTAACCAGACATAAAGTGTTGGTAGCAGACTTCTTAGAACAAAATTAC 600 

GACACTATTTTTGAAGACTATGAGAAATTGCTTCAGTCTGAGAATTATGTTACTAAGAGA 678 

I I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I | | | | | | | | | | | | | | | | M I I I I I I I I I 

GACACTATTTTTGAAGACTATGAGAAATTGCTTCAGTCTGAGAATTATGTTACTAAGAGA 660 

CAGTCTTTAAAGCTGCTAGGGGAGCTGATCCTGGACCGTCACAACTTTGCCATCATGACA 738

M I M I ! M II M II M MM I M MM 1 1 M II 1 1 II M II II M I M Ml II II I 

CAGTCTTTAAAGCTGCTAGGGGAGCTGATCCTGGACCGTCACAACTTTGCCATCATGACA 720
AAGTATATCAGCAAGCCGGAGAACCTGAAACTCATGATGAACCTCCTTCGGGATAAAAGT 798

! I M I M ,i :! MM M M lllllllll Mill 



lllllllllllllllll Mill 



RESULT 7 
BX393735
LOCUS       BX393735    1201 bp    mRNA    linear   EST 13-MAY-2003
DEFINITION  BX393735 Homo sapiens NEUROBLASTOMA COT 25-NORMALIZED Homo sapiens
            cDNA clone CS0DC002YI01 5-PRIME, mRNA sequence.
ACCESSION   BX393735
VERSION     BX393735.1  GI:30624044
KEYWORDS    EST.
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 1201)
  AUTHORS   Li,W.B., Gruber,C., Jessee,J. and Polayes,D.
  TITLE     Full-length cDNA libraries and normalization
  JOURNAL   Unpublished
COMMENT     Contact : Genoscope
            Genoscope - Centre National de Sequencage
            BP 191 91006 EVRY cedex - France
            Email: seqref@genoscope.cns.fr, Web : www.genoscope.cns.fr
            Library was constructed by Life Technologies, a division of
            Invitrogen. This sequence belongs to sequence cluster 6951.r. For
            more information about this cluster, see
            http://www.genoscope.cns.fr/
            cgi-bin/cluster.cgi?seq=CS0DC002AE01QP1&cluster=6951.r. Contact :
            Feng Liang Email : fliang@lifetech.com URL :
            http://fulllength.invitrogen.com/ InVitroGen Corporation 1600
            Faraday Avenue Genoscope sequence ID : CS0DC002AE01QP1.
FEATURES             Location/Qualifiers
     source          1..1201
                     /organism="Homo sapiens"
                     /mol_type="mRNA"
                     /db_xref="taxon:9606"
                     /clone="CS0DC002YI01"
                     /tissue_type="NEUROBLASTOMA COT 25-NORMALIZED"
                     /clone_lib="Homo sapiens NEUROBLASTOMA COT 25-NORMALIZED"
                     /note="1st strand cDNA was primed with a Not I-oligo(dT)
                     primer. Five prime end enriched, double-strand cDNA was
                     digested with Not I and cloned into the Not I and EcoR V
                     sites of the pCMVSPORT 6 vector. Library was normalized."
BASE COUNT      348 a    223 c    239 g    321 t     70 others
ORIGIN



Query Match 74.0%; Score 750.8; DB 13; Length 1201;

Best Local Similarity 91.0%; Pred. No. 3.1e-152; 

Matches 766; Conservative 45; Mismatches 27; Indels 4; Gaps 2; 



Qy 


i 


ATGAAAAAAATGCCTTTGTTTAGTAAATCACACAAAAATCCAGCAGAAATTGTGAAAATC

MIIIIIIIIIMIIIIMIIIIhllllllllllllllMIIIIMIIIIIIIIIIMI 

ATGAAAAAAATGCCTTTGTTTAGTWAATCACACAAAAATCC^ 


60 


Db 


323 


382 


Qy 
Db 


61 
383 


CTGAAAGACAATTTGGCCATTTTGGAAAAGCAAGAC 

Ml M 1 !l 1 1 1 II 1 II 1 II 1 1 1 , 1 1 1 1 1 M Ml MM | M 1 1 1 1 II 1 1 II 1 1 1 1 1 1 Ml 

CTGAAAGACAATTTGGCCATTTTGGAAAAGCAAGACAAAAAGAC^ 


120 
442 


Qy 


121 


GAAGTGTCTAAATCACTGCAAGCAATGAAAGAAATTCTGTGTGGTACAAACGAGAAAGAA

li ' .1 1 'II 1 IM MMI IIMIMI IMM II 1 

GWAGTGTCTWAWTCACTGCTAGCWATGWWAGATATTYTGTGTGGTACAWACGAGWAAGAT 


180 


Db 


443 


502 


Qy 


181 


CCCCCAACAGAAGCAGTGGCTCAGCTAGCACAAGAACTCTACAGCAGTGGC 

MIIIIIIIMIII illMIMM! Ilhllh MM 1 M IMIMIMM 

CCCCCAACAGAAGCTGTGGCTCAGCTTGCAYAAGWTYTCTTCAGYWGTTGCCTGCTAGTG 


240 


Db 


503 


562 


Qy 


241 


ACACTGATAGCTGACCTGCAGCTGATAGACTTTGAGGGAAAAAAAGATGTGACCCAGATA

II 1 1 1 1 1 Ml 1 IM II Ml M 1 II 1 IM II Mi MM 1 1 1 1 M M 1 II M 1 M 1 1 MM 

ACACTGATWGCTGACCTGCAGCTGATAGACTTTKAGGGAAAARDAGATGTGACCCAGATT 


300 


Db 


563 


622 



Qy 
Db 


301 
623 


TTTAACAACATCTTGAGAAGACAGATAGGCACTCGGAGTCCTACTGTGGAGTATATTAGT 

Ml 1 1 1 1 1 1 1 1 1 1 M 1 1 1 1 1 1 1 1 M i M 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 II ! 1 1 1 1 1 1 1 

TTTTACAACATCTTGAGAAGACAGATAGGCACTCGGAGTCCTACTGTGGAGTATATTAGT 


360 
682 


Qy 

Db 


361 
683 


GCTCATCCTCATATCCTGTTTATGCTCCTCAAAGGATATGAAGCCCCACAGATTGCCTTA 

1 1 1 1 1 :i i ; 1 1 1 1 1 ii 1 1 1 1 1 1 1 1 1 1 1 , 1 1 1 1 1 1 1 1 m 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 ii 

GCTCATCCTCATATCCTGTTTATGCTCCTCAAAGGATATGAAGCCCCACAGATTGCCTTA 


420 
742 


Qy 

Db 


421 
743 


CGTTGTGGGATTATGCTGAGAGAATGTATTCGACATGAACCACTTGCCAAAATCATCCTC 

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 i 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 

CGTTGTGGGATTATGCTGAGAGAATGTATTCGACATGAACCACTTGCCAAAATCATCCTC 


480 
802 


Qy 


481 


TTTTCTAATCAATTCAGAGATTTCTTTAAGTACGTGGAGTTGTCAACATTTGATATTGCT 

Ml 1 M II 1 II M 1 II 1 M II 1 II MM MM II MM II 

TTTTCTAATCAATTCAGAGATTTCTTTAAGTACGTGGAGTTGTCAACATTTGATATTGCT 


540 


Db 


803 


862 


Qy 


541 


TCAGATGCCTTTGCTACTTTCAAGGATTTACTAACCAGACATAAAGTGTTGGTAGCAGAC 

IMIIIIIIMIIIIIIMIMIIIMIhlMIIMIIMIIIIIIIIIIMIIIIIII 

TCAGATGCCTTTGCTACTTTCAAGGATTTWCTAACCAGACATAAAGTGTTGGTAGCAGAC


600 


Db 


863 


922 


Qy 


601 


TTCTTAGAACAAAATTACGACACTATTTTTGAAGACTATGAGAAATTGCTTCAGTCTGAG 

IIIIIIIIIIIIIIIIIIIIIIIIMIIIh::||:||IMMIIIIIIII|||IMIII 
TTCTTAGAACAAAATTACGACACTATTTTTKWWGAYTATGAGAAATTGCTTCAGTCTGAG 


660 


Db 


923 


982 


Qy 


661 


AATTATGTTACTAAGAGACAGTCTTTAAAGCTGCTA - GGGGAGCTGATCCTGGACCGTCA 

1 1 - 1 ' 1 1 1 1 1 1 H 1 1 1 1 1 Ml M 1 1 1 1 1 1 1 

AATTATGTTACTAAGAGACAGTCTTTAAAGCTGCTHGGGGGRGCTGATCCTGGACCGTCA 


719 


Db 


983 


1042 


Qy 


720 


CAACTTTGCCATCATGACAAAGTATATCAGGAAGCCGGAGAACCTGAAACTCATGA 

HIM Ml Ml III III MM 11 = 111 = = 1 ==MII = = II = = = = 1 1 1 h | | | | M 1 
CAACTTTGCCATCATGACAAAGTWTATYMYCYMBCCGGSBYHCCYSWWACTCMTGATGAA 


779 


Db 


1043 


1102 


Qy 


780 


CCTCCTTCGGGATAAAAGTCCCAACATCCAGTTTGAAGCCTTTCATGTTTTTAAGGTGTT 

M 1 = 1 Ml 11= II hhlllllllllllllMI = 1111= =11 II 
CCCCYTCGGGGTAAAR KCCQ^MAWCCAGTTTGAAGCCTTTWTKTTTTTWKGKGTTTT 


839 


Db 


1103 


1159 


Qy 


840 


TG 841 




Db 


1160 


II 

TG 1161 





RESULT 8 
AK005323
LOCUS       AK005323    1379 bp    mRNA    linear   HTC 05-DEC-2002
DEFINITION  Mus musculus adult male cerebellum cDNA, RIKEN full-length enriched
            library, clone:1500031K13 product:M025-LIKE PROTEIN homolog [Homo
            sapiens], full insert sequence.
ACCESSION   AK005323
VERSION     AK005323.1  GI:12837793
KEYWORDS    HTC; CAP trapper.
SOURCE      Mus musculus (house mouse)
  ORGANISM  Mus musculus
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
REFERENCE   1
  AUTHORS   Carninci,P. and Hayashizaki,Y.
  TITLE     High-efficiency full-length cDNA cloning
  JOURNAL   Meth. Enzymol. 303, 19-44 (1999)
  MEDLINE   99279253
   PUBMED   10349636
REFERENCE   2
  AUTHORS   Carninci,P., Shibata,Y., Hayatsu,N., Sugahara,Y., Shibata,K.,
            Itoh,M., Konno,H., Okazaki,Y., Muramatsu,M. and Hayashizaki,Y.
  TITLE     Normalization and subtraction of cap-trapper-selected cDNAs to
            prepare full-length cDNA libraries for rapid discovery of new genes
  JOURNAL   Genome Res. 10 (10), 1617-1630 (2000)
  MEDLINE   20499374
   PUBMED   11042159
REFERENCE   3
  AUTHORS   Shibata,K., Itoh,M., Aizawa,K., Nagaoka,S., Sasaki,N., Carninci,P.,
            Konno,H., Akiyama,J., Nishi,K., Kitsunai,T., Tashiro,H., Itoh,M.,
            Sumi,N., Ishii,Y., Nakamura,S., Hazama,M., Nishine,T., Harada,A.,
            Yamamoto,R., Matsumoto,H., Sakaguchi,S., Ikegami,T., Kashiwagi,K.,
            Fujiwake,S., Inoue,K., Togawa,Y., Izawa,M., Ohara,E., Watahiki,M.,
            Yoneda,Y., Ishikawa,T., Ozawa,K., Tanaka,T., Matsuura,S., Kawai,J.,
            Okazaki,Y., Muramatsu,M., Inoue,Y., Kira,A. and Hayashizaki,Y.
  TITLE     RIKEN integrated sequence analysis (RISA) system--384-format
            sequencing pipeline with 384 multicapillary sequencer
  JOURNAL   Genome Res. 10 (11), 1757-1771 (2000)
  MEDLINE   20530913
   PUBMED   11076861
REFERENCE   4
  AUTHORS   Kawai,J., Shinagawa,A., Shibata,K., Yoshino,M., Itoh,M., Ishii,Y.,
            Arakawa,T., Hara,A., Fukunishi,Y., Konno,H., Adachi,J., Fukuda,S.,
            Aizawa,K., Izawa,M., Nishi,K., Kiyosawa,H., Kondo,S., Yamanaka,I.,
            Saito,T., Okazaki,Y., Gojobori,T., Bono,H., Kasukawa,T., Saito,R.,
            Kadota,K., Matsuda,H., Ashburner,M., Batalov,S., Casavant,T.,
            Fleischmann,W., Gaasterland,T., Gissi,C., King,B., Kochiwa,H.,
            Kuehl,P., Lewis,S., Matsuo,Y., Nikaido,I., Pesole,G.,
            Quackenbush,J., Schriml,L.M., Staubli,F., Suzuki,R., Tomita,M.,
            Wagner,L., Washio,T., Sakai,K., Okido,T., Furuno,M., Aono,H.,
            Baldarelli,R., Barsh,G., Blake,J., Boffelli,D., Bojunga,N.,
            Carninci,P., de Bonaldo,M.F., Brownstein,M.J., Bult,C.,
            Fletcher,C., Fujita,M., Gariboldi,M., Gustincich,S., Hill,D.,
            Hofmann,M., Hume,D.A., Kamiya,M., Lee,N.H., Lyons,P.,
            Marchionni,L., Mashima,J., Mazzarelli,J., Mombaerts,P., Nordone,P.,
            Ring,B., Ringwald,M., Rodriguez,I., Sakamoto,N., Sasaki,H.,
            Sato,K., Schonbach,C., Seya,T., Shibata,Y., Storch,K.F., Suzuki,H.,
            Toyo-oka,K., Wang,K.H., Weitz,C., Whittaker,C., Wilming,L.,
            Wynshaw-Boris,A., Yoshida,K., Hasegawa,Y., Kawaji,H., Kohtsuki,S.
            and Hayashizaki,Y.
  TITLE     Functional annotation of a full-length mouse cDNA collection
  JOURNAL   Nature 409 (6821), 685-690 (2001)
  MEDLINE   21085660
   PUBMED   11217851
REFERENCE   5
  AUTHORS   The FANTOM Consortium and the RIKEN Genome Exploration Research
            Group Phase I & II Team.
  TITLE     Analysis of the mouse transcriptome based on functional annotation
            of 60,770 full-length cDNAs
  JOURNAL   Nature 420, 563-573 (2002)
REFERENCE   6  (bases 1 to 1379)
  AUTHORS   Adachi,J., Aizawa,K., Akahira,S., Akimura,T., Arai,A., Aono,H.,
            Arakawa,T., Bono,H., Carninci,P., Fukuda,S., Fukunishi,Y.,
            Furuno,M., Hanagaki,T., Hara,A., Hayatsu,N., Hiramoto,K.,
            Hiraoka,T., Hori,F., Imotani,K., Ishii,Y., Itoh,M., Izawa,M.,
            Kasukawa,T., Kato,H., Kawai,J., Kojima,Y., Konno,H., Kouda,M.,
            Koya,S., Kurihara,C., Matsuyama,T., Miyazaki,A., Nishi,K.,
            Nomura,K., Numazaki,R., Ohno,M., Okazaki,Y., Okido,T., Owa,C.,
            Saito,H., Saito,R., Sakai,C., Sakai,K., Sano,H., Sasaki,D.,
            Shibata,K., Shibata,Y., Shinagawa,A., Shiraki,T., Sogabe,Y.,
            Suzuki,H., Tagami,M., Tagawa,A., Takahashi,F., Tanaka,T.,
            Tejima,Y., Toya,T., Yamamura,T., Yasunishi,A., Yoshida,K.,
            Yoshino,M., Muramatsu,M. and Hayashizaki,Y.
  TITLE     Direct Submission
  JOURNAL   Submitted (10-JUL-2000) Yoshihide Hayashizaki, The Institute of
            Physical and Chemical Research (RIKEN), Laboratory for Genome
            Exploration Research Group, RIKEN Genomic Sciences Center (GSC),
            RIKEN Yokohama Institute; 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama,
            Kanagawa 230-0045, Japan (E-mail:genome-res@gsc.riken.go.jp,
            URL:http://genome.gsc.riken.go.jp/, Tel:81-45-503-9222,
            Fax:81-45-503-9216)
COMMENT     Please visit our web site (http://genome.gsc.riken.go.jp/) for
            further details.
            cDNA library was prepared and sequenced in Mouse Genome
            Encyclopedia Project of Genome Exploration Research Group in Riken
            Genomic Sciences Center and Genome Science Laboratory in RIKEN.
            Division of Experimental Animal Research in Riken contributed to
            prepare mouse tissues. First strand cDNA was primed with a primer
            [5' GAGAGAGAGAAGGATCCAAGAGCTCTTTTTTTTTTTTTTTTVN 3'], cDNA was
            prepared by using trehalose thermo-activated reverse transcriptase
            and subsequently enriched for full-length by cap-trapper. Second
            strand cDNA was prepared with the primer adapter of sequence [5'
            GAGAGAGAGAGCGGCCGCAATTAATTCTCGAGTTAATTAAATTAATCCCCCCCCCCC 3']. cDNA
            was cleaved with XhoI and SstI. Cloning sites, 5' end: XhoI; 3'
            end: SstI. Host: SOLR.
FEATURES             Location/Qualifiers
     source          1..1379
                     /organism="Mus musculus"
                     /mol_type="mRNA"
                     /strain="C57BL/6J"
                     /db_xref="FANTOM_DB:1500031K13"
                     /db_xref="MGI:1901050"
                     /db_xref="taxon:10090"
                     /clone="1500031K13"
                     /sex="male"
                     /tissue_type="cerebellum"
                     /clone_lib="RIKEN full-length enriched mouse cDNA library"
                     /dev_stage="adult"
     CDS             285..1175
                     /note="unnamed protein product; M025-LIKE PROTEIN homolog
                     [Homo sapiens] (SWISSPROT|Q9H9S4, evidence: FASTY,
                     98.2%ID, 100%length, match=1002)
                     putative"
                     /codon_start=1
                     /protein_id="BAB23953.2"
                     /db_xref="GI:26342524"
                     /db_xref="MGI:1916258"
                     /translation="MKKMPLFSKSHKNPAEIVKILKDNLAILEKQDKKTDKASEEVSK
                     PLQAMKEILCGTNDKEPPTEAVAQLAQELYSSGLLVTLIADLQLIDFEGKKDVTQIFN
                     NILRRQIGTRCPTVEYISSHPHILFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIIL
                     FSNQFRDFFKYVELSTFDIASDAFATFKDLLTRHKVLVADFLEQNYDTIFEDYEKLLQ
                     SENYVTKRQSLKLLGELILDRHNFTIMTKYISKPENLKLMMNLLRDKSPNIQFEAFHV
                     FKNSVFITNRIHGLKRWLSS"
     polyA_signal    1361..1366
                     /note="putative"
     polyA_site      1379
                     /note="putative"
BASE COUNT      452 a    284 c    291 g    352 t
ORIGIN

Query Match 70.0%; Score 709.4; DB 11; Length 1379; 

Best Local Similarity 88.9%; Pred. No. 3e-143; 

Matches 767; Conservative 0; Mismatches 96; Indels 0; Gaps 0; 

ATGAAAAAAATGCCTTTGTTTAGTAAATCACACAAAAATCCAGCA 6 0 

M M 1 1 II 1 1 1 Ml II Ml M M 1 1 1 1 1 1 1 M MM MM I II M 1 1 1 1 Mill 

ATGAAAAAAATGCCTTTGTTTAGTAAATCACACAAAAATCCAGCA 344 
CTGAAAGACAATTTGGCCATTTTGGAAAAGCAAG 12 0 

'IN 1 1 , 1 1 1 1 1 1 1. Ml II 1 1 1 M 1 1 1 II II 1 1 1 1 II I II III II III 

CTGAAAGACAACCTGGCCATTTTGGAAAAGCAAGAC^^ 4 04 

GAAGTGTCTAAATCACTGCAAGCAATGAA 180 

M Mill Ml I 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II Mill II II 

GAGGTGTCAAAA.CCTCTGCAAGCAATGAAGGA 4 64 

CCCCCAACAGAAGCAGTGGCTCAGCTAGCACAAGAACTC 24 0 

Mill I M M M I M M M M M M II II II MM II MM Ml 

CCCCCTACAGAAGCAGTGGCTCAGCTGGCGCAGGAGCTCTACAGCAGCGGGTTGCTGGTG 524
ACACTGATAGCTGACCTGCAGCTGATAGACTTTGAGGGAAAAAAAGATGTGACCCAGATA 300

MIM MMMMMMMMI I M M I II 1 1 1 1 1 II 1 1 1 1 M I II I II 1 1 1 1 M I II 

AC^CTCATAGCTGACCTGCAGCTCATAGACTTTGAGGGAAAAAAAGATGTGACCCAGATA 584 
TTTAACAACATCTTGAGAAGACAGATAGGCACTCGGAGTCCTACTGTGGAGTATATTAGT 360 

M MMIMM I II II Ml 1 1 1 1 1 1 1 1 1 1 Mill II Ml 

TTCAAGAA(^TCCTGAGAAGA(^GATTGGTACACGGTGTCCTACTGTCGAGTACATCAGT 644 
GCTCATCCTCATATCCTGTTTATGCTCCTCAAAGGATATGAAGCCCCACAGATTGCCTTA 420

III Ml ,MMM M 1 1 1 1 M I Ml I Ml II I II 1 1 

TCT(^TCCTCACATCCTGTTTATGCTTCTO^VAGGCTATGAAGCCCCACAGATTGCCTTA 704 
CGTTGTGGGATTATGCTGAGAGAATGTATTCGACATGAACCACTTGCCAAAATCATCCTC 48 0 

M MM II 1 1 1 1 1 1 1 1 1 1| | 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 

CGCTGTGGGATTATGCTAAGAGAGTGTATTCGACATGAGCCACTTGCCAAAATCATCCTA 764 
TTTTCTAATCAATTCAGAGATTTCTTTAAGTACGTGGAGTTGTCAACATTTGATATTGCT 54 0 

I M M M II 1 1 II II Mill M Ml 1 1 1 1 II 1 1 1 1 1 1 1 1 Ml 

TTTTCTAATCAGTTCAGAGATTTCTTCAAGTATGTTGAGCTGTCCACCTTTGATATCGCT 824 
TCAGATGCCTTTGCTACTTTCAAGGATTTACTAACCAGACATAAAGTGTTGGTAGCAGAC 60 0 

I M II II II 1 1 1 1 II 1 1 II MMMM I Ml 1 1 II I II I II I MIM! I ill) I 

TCAGATG C CTT CGCTACTTTTAAGGATTTGTTAACCAGACATAAAGTATTGGTAGCAGAC 884 
TTCTTAGAACAAAATTACGACACTATTTTTGAAGACTATGAGAAATTGCTTCAGTCTGAG 660 

M M I M II 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 M I! 1 1 1 1 1 1 1 1 If 1 1 1 1 1 MM II MIMI 



Qy 


i 


Db 


285 


Qy 


61 


Db 


345 


Qy 


121 


Db 


405 


Qy 


181 


Db 


465 


Qy 


241 


Db 


525 


Qy 


301 


Db 


585 


Qy 


361 


Db 


645 


Qy 


421 


Db 


705 


Qy 


481 


Db 


765 


Qy 


541 


Db 


825 


Qy 


601 



Db 


885 


TTCTTAGAACAAAATTATGAO^CTATTTTTGAAGACTATGAGAAACTGCTGCAATCTGAG 


944 


Qy 


661 


AATTATGTTACTAAGAGACAGTCTTTAAAGCTGCTAGGGGAGCTGATCCTGGACCGTCAC 

II INN M MINIM NIMH IMMM MMMMMMMMI III 

AACTATGTGACAAAGAGACAATCTTTAAAGTTGCTAGGTGAGCTGATCCTGGACCGCCAC 


720 


Db 


945 


1004 


Qy 


721 


AACTTTGCCATCATGACAAAGTATATCAGCAAGCCGGAGAACCTGAAACTCATGATGAAC

M II MM Mill MMMMMMMMI 1 1 1 1 1 1 1 1 1 1 1 1 1 1 MIMIIII 

AATTTCACCATTATGACCAAGTATATCAGCAAGCCAGAGAACCTGAAACTGATGATGAAC


780 


Db 


1005 


1064 


Ov 


781 


CTCCTTCGGGATAAAAGTCCCAACATCCAGTTTGAAGCCTTTCATGTTTTTAAGGTGTTT

M IMM M MMMMMMMMI II IMMMI Mill MMM 1 1 

CTGCTTCGAGACAAAAGTCCCAACATCCAATTCGAAGCCTTCCATGTCTTTAAGAATTCT 


840 


Db 


1065 


1124 


Qy 


841 


GTGGCCAGTCCTCACAAAACACA 863 

M 1 1 1 1 1 II III 

GTCTTTATTACAAATAGAATACA 114 7 




Db 


1125 





RESULT 9 
BG218735
LOCUS       BG218735     784 bp    mRNA    linear   EST 21-APR-2001
DEFINITION  RST38476 Athersys RAGE Library Homo sapiens cDNA, mRNA sequence.
ACCESSION   BG218735
VERSION     BG218735.1  GI:13744756
KEYWORDS    EST.
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 784)
  AUTHORS   Harrington,J.J., Sherf,B., Rundlett,S., Jackson,P.D., Perry,R.,
            Cain,S., Leventhal,C., Thornton,M., Ramachandran,R.,
            Whittington,J., Lerner,L., Costanzo,D., McElligott,K., Boozer,S.,
            Mays,R., Smith,E., Veloso,N., Klika,A., Hess,J., Cothren,K.,
            Lo,K., Offenbacher,J., Danzig,J. and Ducar,M.
  TITLE     Creation of genome-wide protein expression libraries using random
            activation of gene expression
  JOURNAL   Nat. Biotechnol. 19 (5), 440-445 (2001)
  MEDLINE   21227151
   PUBMED   11329013
COMMENT     Contact: Scott J. Cain
            Athersys, Inc.
            3201 Carnegie Ave, Cleveland, OH 44115, USA
            Tel: 216 431 9900
            Fax: 216 361 9596
            Email: scain@athersys.com
            High quality sequence stop: 515.
FEATURES             Location/Qualifiers
     source          1..784
                     /organism="Homo sapiens"
                     /mol_type="mRNA"
                     /db_xref="taxon:9606"
                     /cell_line="HT1080"
                     /clone_lib="Athersys RAGE Library"
                     /note="See 'Creation of Genome-wide Protein Expression
                     Libraries using Random Activation of Gene Expression',
                     Nature Biotechnology, in press. Note that even though the
                     cell type indicated is HT1080, since a random activation
                     method was used, these sequence tags are not necessarily
                     expressed in HT1080 under normal circumstances."
BASE COUNT      265 a    151 c    158 g    209 t      1 others
ORIGIN

Query Match 66.3%; Score 671.8; DB 10; Length 784; 

Best Local Similarity 96.5%; Pred. No. 3.8e-135; 

Matches 718; Conservative 0; Mismatches 23; Indels 3; Gaps 3; 
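
The summary figures above appear to follow from simple ratios. The sketch below is an assumption inferred from the numbers in this listing, not a documented definition of the search program: it takes "Query Match" to be the hit score divided by the perfect score (1014, from the report header) and "Best Local Similarity" to be matches divided by all aligned columns (matches, conservative, mismatches, indels). The function names are hypothetical.

```python
# Assumed relationships, inferred from the figures printed in this report.
PERFECT_SCORE = 1014  # perfect score for the query, as given in the report header


def query_match_pct(score: float, perfect_score: float) -> float:
    """Hit score as a percentage of the perfect (self-match) score."""
    return 100.0 * score / perfect_score


def best_local_similarity_pct(matches: int, conservative: int,
                              mismatches: int, indels: int) -> float:
    """Identical columns as a percentage of all aligned columns."""
    aligned = matches + conservative + mismatches + indels
    return 100.0 * matches / aligned


# Figures for RESULT 9 (BG218735) taken from the lines above.
qm = round(query_match_pct(671.8, PERFECT_SCORE), 1)
bls = round(best_local_similarity_pct(718, 0, 23, 3), 1)
print(qm, bls)
```

Under these assumptions the same ratios can be checked against the other hits in this listing by substituting their Score, Matches, Conservative, Mismatches and Indels values.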

ATGAAAAAAATGCCTTTGTTTAGTAA^ 6 0 

1 1 1 1 1 1 1 1 1 1 MM 1 1 1 1 1 III 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1| 1 1| ||! 1 1 1 1 1 1 1 inn 

ATGAAAAGAATGCCTTTGTTTAGTAAATCACACAAAAATCCAGCAGAAA 9 1 

CTGAAAGACAATTTGGCCATTTTGGAAAAGCAAGACAA^ 12 0 

jll ill I III I 'I' IMMIIIIIIIIIIIII MINIMUM 

CTGAAAGACAATTTGGCCATTTTGGAAAAG CAAGACAAAAAGACAGACAAGG CTT CAGAA 151 
GAAGTGTCTAAATCACTGCAAGCAATGAAAGAAATTCTGTGTGGTACAAACGAGAAAGAA 18 0 

M M MMMM MIM 1 1 1 1 MMI 1 1 II 1 1 1 1 1 MM 1 1 IMM II II 1 1 1 Ml I MM 

GAAGTGTCTAAATCACTGOyVGCAATGAAAGAAATTCTGTGTGGTACAAACGAGAAA 211 
CCCCCAACAGAAGCAGTGGCTCAGCTAGCAC^ 24 0 

M II 1 1 1 MM Ml II II II I MM M I Ml I M Ml III II Ml MMI 1 1 MM! 

CCCCCAACAGAAGCAGTGGCCCAGCTAGCAC^ 271 



Qy 


i 


Db 


32 


Qy 


61 


Db 


92 


Qy 


121 


Db 


152 


Qy 


181 


Db 


212 


Qy 


241 


Db 


272 


Qy 


301 


Db 


332 


Qy 


361 


Db 


392 


Qy 


421 


Db 


452 


Qy 


481 


Db 


512 


Qy 


541 


Db 


572 


Qy 


601 


Db 


632 


Qy 


660 



MMI I MM M 1 1 1 M I M I IMM 1 1 1 1 1 1 1 1 1 1 i II II Ml I MMI 1 1 1 1 Ml M I! 



300 
331 



TTTAACAACATCTTGAGAAGACAGATAGGCACTCGGAGTCCTACTGTGGAGTATATTAGT 360 

1 1 1 1 M 1 1 1 1 1 II 1 1 [ 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M I 

TTTAACAACATCTTGAGAAGACAGATAGGCACTCGGAGTCCTACTGTGGAGTATATTAGT 391
GCTCATCCTCATATCCTGTTTATGCTCCTCAAAGGATATGAAGCCCCACAGATTGCCTTA 420

M III M I M M I M M I ! II M 1 1 IM 1 1 II 1 1 1 IMIM I Ml III M I III 1 1 II 

GCTCATCCTCATATCCTGTTTATGCTCCTCAAAGGATATGAAGCCCCACAGATTGCCCTA 451
CGTTGTGGGATTATGCTGAGAGAATGTATTCGACATGAACCACTTGCCAAAATCATCCTC 480 

MM 1 1 IM II 1 1 1 II 1 1 1 1 II l| M 1 1 1 1 1 1 1 1 1 Ml M M 1 1 IMM M I IM 1 1 III 

CGTTGTGGGATTATGCTGAGAGAATGTATTCGACATGAACCACTTGCCAAAATCATCCTC 511
TTTTCTAATCAATTCAGAGATTTCTTTAAGTACGTGGAGTTGTCAACATTTGATATTGCT 540

I IN II M MMMMMMMMMMMMMMMMMMM! 

TTTTCTAATCAATTCAGAGATTTCTTTAAGTACGTGGAGTTGTCAACATTTGATATTGCT 571
TCAGATGCCTTTGCTACTTTCAAGGATTTACTAACCAGACATAAAGTGTTGGTAGCAGAC 600

"Ml N i IN IN ININI NN III INNININ 1 1 1 II ill NNI N I INN I IN 

T(^GATGCCTTTGCTACTTT(^GGATTTACTAACC^GACATAAAGTGTTGGTAGCAGAC 631 
TTCTTAGAACAAAATTACGACACTATTTTTGAAGACTATG - AGAAATTGCTTCAGTCTGA 65 9 

MINI I III I lllllllllllllll lllllllllllllllllll 

TTCTTAAACAAAATTACGACACTATTTTTTGAAGACTATGAAGAAATTGCTTCAGTCTGA 6 91 
GAATTATGTTAC - TAAGAGACAGTCTTTAAAGCTGCTAGGGGAGCTGATCCTGGACCGTC 718 

IMIM M 1 M I M M 1 1 1 1 1 1 1 1 1 1 1 1 1 i I ( 1 1 1 M M 1 1 1 1 1 i I 1 1 



Db 692 GAATTATGTTACTTAAGAGACAGTCTTTAGAGCTGCTAGGGGAGCTGATCCTGAAANGTT 751 



Qy 

Db 



719 ACAACTTTGCCATCATGACAAAGT 742 

Illllllllll IIIMIIIIIII 

752 ACAACTTTGCC - TCATGACAAAGT 774



RESULT 10 

AK013161
LOCUS       AK013161    1281 bp    mRNA    linear   HTC 05-DEC-2002
DEFINITION  Mus musculus 10, 11 days embryo whole body cDNA, RIKEN full-length
            enriched library, clone:2810425O13 product:M025-LIKE PROTEIN
            homolog [Homo sapiens], full insert sequence.
ACCESSION   AK013161
VERSION     AK013161.1  GI:12850350
KEYWORDS    HTC; CAP trapper.
SOURCE      Mus musculus (house mouse)
  ORGANISM  Mus musculus
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
REFERENCE   1
  AUTHORS   Carninci,P. and Hayashizaki,Y.
  TITLE     High-efficiency full-length cDNA cloning
  JOURNAL   Meth. Enzymol. 303, 19-44 (1999)
  MEDLINE   99279253
   PUBMED   10349636
REFERENCE   2
  AUTHORS   Carninci,P., Shibata,Y., Hayatsu,N., Sugahara,Y., Shibata,K.,
            Itoh,M., Konno,H., Okazaki,Y., Muramatsu,M. and Hayashizaki,Y.
  TITLE     Normalization and subtraction of cap-trapper-selected cDNAs to
            prepare full-length cDNA libraries for rapid discovery of new genes
  JOURNAL   Genome Res. 10 (10), 1617-1630 (2000)
  MEDLINE   20499374
   PUBMED   11042159
REFERENCE   3
  AUTHORS   Shibata,K., Itoh,M., Aizawa,K., Nagaoka,S., Sasaki,N., Carninci,P.,
            Konno,H., Akiyama,J., Nishi,K., Kitsunai,T., Tashiro,H., Itoh,M.,
            Sumi,N., Ishii,Y., Nakamura,S., Hazama,M., Nishine,T., Harada,A.,
            Yamamoto,R., Matsumoto,H., Sakaguchi,S., Ikegami,T., Kashiwagi,K.,
            Fujiwake,S., Inoue,K., Togawa,Y., Izawa,M., Ohara,E., Watahiki,M.,
            Yoneda,Y., Ishikawa,T., Ozawa,K., Tanaka,T., Matsuura,S., Kawai,J.,
            Okazaki,Y., Muramatsu,M., Inoue,Y., Kira,A. and Hayashizaki,Y.
  TITLE     RIKEN integrated sequence analysis (RISA) system--384-format
            sequencing pipeline with 384 multicapillary sequencer
  JOURNAL   Genome Res. 10 (11), 1757-1771 (2000)
  MEDLINE   20530913
   PUBMED   11076861
REFERENCE   4
  AUTHORS   Kawai,J., Shinagawa,A., Shibata,K., Yoshino,M., Itoh,M., Ishii,Y.,
            Arakawa,T., Hara,A., Fukunishi,Y., Konno,H., Adachi,J., Fukuda,S.,
            Aizawa,K., Izawa,M., Nishi,K., Kiyosawa,H., Kondo,S., Yamanaka,I.,
            Saito,T., Okazaki,Y., Gojobori,T., Bono,H., Kasukawa,T., Saito,R.,
            Kadota,K., Matsuda,H., Ashburner,M., Batalov,S., Casavant,T.,
            Fleischmann,W., Gaasterland,T., Gissi,C., King,B., Kochiwa,H.,
            Kuehl,P., Lewis,S., Matsuo,Y., Nikaido,I., Pesole,G.,
            Quackenbush,J., Schriml,L.M., Staubli,F., Suzuki,R., Tomita,M.,
            Wagner,L., Washio,T., Sakai,K., Okido,T., Furuno,M., Aono,H.,
            Baldarelli,R., Barsh,G., Blake,J., Boffelli,D., Bojunga,N.,
            Carninci,P., de Bonaldo,M.F., Brownstein,M.J., Bult,C.,
            Fletcher,C., Fujita,M., Gariboldi,M., Gustincich,S., Hill,D.,
            Hofmann,M., Hume,D.A., Kamiya,M., Lee,N.H., Lyons,P.,
            Marchionni,L., Mashima,J., Mazzarelli,J., Mombaerts,P., Nordone,P.,
            Ring,B., Ringwald,M., Rodriguez,I., Sakamoto,N., Sasaki,H.,
            Sato,K., Schonbach,C., Seya,T., Shibata,Y., Storch,K.F., Suzuki,H.,
            Toyo-oka,K., Wang,K.H., Weitz,C., Whittaker,C., Wilming,L.,
            Wynshaw-Boris,A., Yoshida,K., Hasegawa,Y., Kawaji,H., Kohtsuki,S.
            and Hayashizaki,Y.
  TITLE     Functional annotation of a full-length mouse cDNA collection
  JOURNAL   Nature 409 (6821), 685-690 (2001)
  MEDLINE   21085660
   PUBMED   11217851
REFERENCE   5
  AUTHORS   The FANTOM Consortium and the RIKEN Genome Exploration Research
            Group Phase I & II Team.
  TITLE     Analysis of the mouse transcriptome based on functional annotation
            of 60,770 full-length cDNAs
  JOURNAL   Nature 420, 563-573 (2002)
REFERENCE   6  (bases 1 to 1281)
  AUTHORS   Adachi,J., Aizawa,K., Akahira,S., Akimura,T., Arai,A., Aono,H.,
            Arakawa,T., Bono,H., Carninci,P., Fukuda,S., Fukunishi,Y.,
            Furuno,M., Hanagaki,T., Hara,A., Hayatsu,N., Hiramoto,K.,
            Hiraoka,T., Hori,F., Imotani,K., Ishii,Y., Itoh,M., Izawa,M.,
            Kasukawa,T., Kato,H., Kawai,J., Kojima,Y., Konno,H., Kouda,M.,
            Koya,S., Kurihara,C., Matsuyama,T., Miyazaki,A., Nishi,K.,
            Nomura,K., Numazaki,R., Ohno,M., Okazaki,Y., Okido,T., Owa,C.,
            Saito,H., Saito,R., Sakai,C., Sakai,K., Sano,H., Sasaki,D.,
            Shibata,K., Shibata,Y., Shinagawa,A., Shiraki,T., Sogabe,Y.,
            Suzuki,H., Tagami,M., Tagawa,A., Takahashi,F., Tanaka,T.,
            Tejima,Y., Toya,T., Yamamura,T., Yasunishi,A., Yoshida,K.,
            Yoshino,M., Muramatsu,M. and Hayashizaki,Y.
  TITLE     Direct Submission
  JOURNAL   Submitted (10-JUL-2000) Yoshihide Hayashizaki, The Institute of
            Physical and Chemical Research (RIKEN), Laboratory for Genome
            Exploration Research Group, RIKEN Genomic Sciences Center (GSC),
            RIKEN Yokohama Institute; 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama,
            Kanagawa 230-0045, Japan (E-mail:genome-res@gsc.riken.go.jp,
            URL:http://genome.gsc.riken.go.jp/, Tel:81-45-503-9222,
            Fax:81-45-503-9216)
COMMENT     Please visit our web site (http://genome.gsc.riken.go.jp/) for
            further details.
            cDNA library was prepared and sequenced in Mouse Genome
            Encyclopedia Project of Genome Exploration Research Group in Riken
            Genomic Sciences Center and Genome Science Laboratory in RIKEN.
            Division of Experimental Animal Research in Riken contributed to
            prepare mouse tissues. First strand cDNA was primed with a primer
            [5' GAGAGAGAGAAGGATCCAAGAGCTCTTTTTTTTTTTTTTTTVN 3'], cDNA was
            prepared by using trehalose thermo-activated reverse transcriptase
            and subsequently enriched for full-length by cap-trapper. cDNA went
            through one round of normalization to Rot = 7.5 and subtraction to
            Rot = 37.5. Second strand cDNA was prepared with the primer adapter
            of sequence [5'
            GAGAGAGAGATTCTCGAGTTAATTAAATTAATCCCCCCCCCCCCC 3']. cDNA was cleaved
            with XhoI and SstI. Cloning sites, 5' end: XhoI; 3' end: SstI.
            Host: SOLR.



FEATURES             Location/Qualifiers
     source          1..1281
                     /organism="Mus musculus"
                     /mol_type="mRNA"
                     /strain="C57BL/6J"
                     /db_xref="FANTOM_DB:2810425O13"
                     /db_xref="MGI:1908997"
                     /db_xref="taxon:10090"
                     /clone="2810425O13"
                     /tissue_type="whole body"
                     /clone_lib="RIKEN full-length enriched mouse cDNA library"
                     /dev_stage="10, 11 days embryo"
     misc_feature    289..1096
                     /note="M025-LIKE PROTEIN homolog [Homo sapiens]
                     (SWISSPROT|Q9H9S4, evidence: FASTY, 98.2%ID, 100%length,
                     match=1002)
                     putative"
                     /db_xref="MGI:1922871"
BASE COUNT      383 a    275 c    291 g    332 t
ORIGIN



Query Match 61.4%; Score 622.8; DB 11; Length 1281; 

Best Local Similarity 82.8%; Pred. No. 1.7e-124; 

Matches 763; Conservative 0; Mismatches 97; Indels 62; Gaps 2; 
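The percentages in this summary block appear consistent with two simple ratios: Query Match as Score divided by the perfect score, and Best Local Similarity as Matches divided by the number of aligned columns (Matches + Mismatches + Indels). A minimal Python sketch under those assumptions follows. The perfect score of 1014 is assumed to equal the query length under identity scoring, as suggested by the run header, and the function name is hypothetical.

```python
def alignment_stats(score, perfect_score, matches, mismatches, indels):
    """Return (query_match_pct, best_local_similarity_pct), rounded to 0.1.

    Assumed definitions (inferred from the reported numbers, not taken
    from the search tool's documentation):
      query match           = score / perfect_score
      best local similarity = matches / (matches + mismatches + indels)
    """
    aligned_columns = matches + mismatches + indels
    query_match = 100.0 * score / perfect_score
    similarity = 100.0 * matches / aligned_columns
    return round(query_match, 1), round(similarity, 1)


# Values copied from the summary block above.
print(alignment_stats(622.8, 1014, 763, 97, 62))
```

For the values above this reproduces the reported 61.4% and 82.8%.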



Qy 93 AGACAAAAAGACAGACAAGGCTTCAGAAGAAGTGTCTAAATCACTGCAAGCAATGAAAGA 152
HI 1 1 M llllllllllllll Mill Mill 1 1 II II II 1 II II 1 II
Db 237 AGAAGA(^GGATTTCTAAGGCTTCAGAAGAGGTGTC^W^TCTCTG(^Ga^TGAAGGA 296


Qy 153 AATTCTGTGTGGTACAAACGAGAAAGAACCCCCAACAGAAGC^ 212
II II Mill II II Mill 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M ||
Db 297 AATTCTGTGTGGAACGAACGACAAGGAGCCCCCTACAGAAGC^GTGGCTCAGCTGGCGCA 356


Qy 213 AGAACTCTACAGCAGTGGCCTGCTAGTGACACTGATAGCTGACCTGCAGCTGATAGACTT 272
M IIIIMIMII II MM IIIMIII 1 M 1 1 1 II 1 1 M 1 1 1 f 1 IMIIIII
Db 357 GGAGCTCTACAGCAGCGGGTTGCTGGTGACACTCATAGCTGACCTGCAGCTCATAGACTT 416


Qy 273 TGAGGGAAAAAAAGATGTGACCC^GATATTTAACAACATCTTGAGAAGACAGATAGGCAC 332
1 1 1 1 1 1 1 1 M M M 1 1 1 1 1 II II II M M 1 MIMIIII MMMMIMM II II
Db 417 TGAGGGAAAAAAAGATGTGACCC^GATATTCAACAACATCCTGAGAAGACAGATTGGTAC 476


Qy 333 TCGGAGTCCTACTGTGGAGTATATTAGTGCTCATCCTCATATCCTGTTTATGCTCCTCAA 392
Ml MIIIIIMI Mill II III 1 1 1 1 1 1 1 1 1 1 1 1 1 ! 1 1 1 i 1 ! 1 1 Mill
Db 477 ACGGTGTCCTACTGTCGAGTAC^TC^GTTCTCATCCTCACATCCTGTTTATGCTTCTCAA 536


Qy 393 AGGATATGAAGCCCCACAGATTGCCTTACGTTGTGGGATTATGCTGAGAGAATGTATTCG 452
III IMIIIII 1 1 1 1 f 1 1 1 f 1 1 1 1 MIMMIMIIII Mill IIIMIII
Db 537 AGGCTATGAAGCCCCACAGATTGCCTTACGCTGTGGGATTATGCTAAGAGAGTGTATTCG 596


Qy 453 AC^VTGAACCACTTGCCAAAATCATCCTCTTTTCTAATCAATTCAGAGATTTCTTTAAGTA 512
IIMM IIIIIIIMIIIMIIIIII M II 1 M II II 1 1 1 ! 1 1 1 1 1 1 1 1 1 Mill
Db 597 A(^TGAGCC^CTTGCC^W^TCATCCTATTTTCTAATCAGTT(^GAGATTTCTTC^AGTA 656


Qy 513 CGTGGAGTTGTC^CATTTGATATTGCTTCAGATGCCTTTGCTACTTTCAAGGATTTACT 572
II Ml MM II IMIIIII MIMMMIMM IIIMIII III
Db 657 TGTTGAGCTGTCCACCTTTGATATCGCTTCAGATGCCTTCGCTACTTTTAAG 708



Qy 573 AACC^GACATAAAGTGTTGGTAGC^GACTTCTTAGAACAAAATTACGACACTATTTTTGA 632
MINIM
Db 709 716


Qy 633 AGACTATGAGAAATTGCTTCAGTCTGAGAATTATGTTACTAAGAGACAGTCTTTAAAGCT 692
MUM MM II 1 1 1 1 1 1 1 1 Mill II MINIM MINIMI 1
Db 717 AGACTATGAGAAACTGCTGCAATCTGAGAACTATGTGACAAAGAGACAATCTTTAAAGTT 776


Qy 693 GCTAGGGGAGCTGATCCTGGACCGTCACAACTTTGCCATCATGACAAAGTATATCAGCAA 752
MINI Mi l, INN N II II INN IN 1 MM M M
Db 777 GCTAGGTGAGCTGATCCTGGACCGCCACAATTTCACCATTATGACCAAG--TATCAGCAA 834


Qy 753 GCCGGAGAACCTGAAACTCATGATGAACCTCCTTCGGGATAAAAGTCCCAACATCCAGTT 812
IN .I'M MMMMMI INN N MMIUIMMMIM N
Db 835 GCC^GAGAACCTGAAACTGATGATGAACCTGCTTCGAGACAAAAGTCCCAACATCCAATT 894


Qy 813 TGAAGCCTTTCATGTTTTTAAGGTGTTTGTGGCCAGTCCTCACAAAACACAGCCTATTGT 872
NINNI INN MMMMMMMMMM! II MM 1 llllllll II
Db 895 CGAAGCCTTCCATGTCTTTAAGGTGTTTGTGGCCAGCCCCCACAAAACGCAGCCTATCGT 954


Qy 873 GGAGATCCTGTTAAAAAATCAGCCOU^CTCTVTTGAGTTTCTGAGCAGCTTCCAAAAAGA 932
MINI 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 II 1 1 1 N 1 1 1 1 1 1 1 II 1 1 II INN
Db 955 GGAGATTCTGTTAAAAAATCAGCCCAAACTCATTGAGTTTCTGAGCAGCTTTCAGAAAGA 1014


Qy 933 ^Lrt.oo.rt.^vji^j-H. i i lxhajUMIj 1 1 i LIjALt AALrAA L 1 ACTTGATTAAACAGATCCGAGA 992
[INN II II llllllll 1 1 1 1 1 1 1 1 1 i 1 1 r r f 1 1 f II 1 1 1 M 1 1 M 1 1 INN
Db 1015 AAGGACAGACGACGAGCAGTTTGCTGACGAGAAGAACTACCTGATTAAACAGATTCGAGA 1074


Qy 993 CTTGAAGAAAACGGCCCCTTGA 1014
1,111 1 Mill Ml
Db 1075 CTTGAAGAAAGCAGCCCCGTGA 1096





RESULT 11
BU116522
LOCUS BU116522 951 bp mRNA linear EST 25-NOV-2002
DEFINITION 603139786F1 CSEQCHL15 Gallus gallus cDNA clone ChEST129122 5', mRNA
sequence.
ACCESSION BU116522
VERSION BU116522.1 GI:25323402
KEYWORDS EST.
SOURCE Gallus gallus (chicken)
ORGANISM Gallus gallus
Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Archosauria; Aves; Neognathae; Galliformes; Phasianidae;
Phasianinae; Gallus.
REFERENCE 1 (bases 1 to 951)
AUTHORS Boardman,P.E., Sanz-Ezquerro,J., Overton,I.M., Burt,D.W., Bosch,E.,
Fong,W.T., Tickle,C., Brown,W.R.A., Wilson,S.A. and Hubbard,S.J.
TITLE A Comprehensive Collection of Chicken cDNAs
JOURNAL Curr. Biol. 12 (22), 1965-1969 (2002)
MEDLINE 22335534
PUBMED 12445392
COMMENT Contact: Simon Hubbard
Department of Biomolecular Sciences
University of Manchester Institute of Science and Technology (UMIST)
PO Box 88, Manchester, M60 1QD, UK
Tel: 01612008930
Fax: 01612360409
Email: Simon.Hubbard@umist.ac.uk.
FEATURES Location/Qualifiers 
source 1..951
/organism="Gallus gallus"
/mol_type="mRNA"
/strain="Compton Line 151"
/db_xref="taxon:9031"
/clone="ChEST129122"
/sex="Female"
/tissue_type="cerebrum"
/dev_stage="adult"
/lab_host="DH10B"
/clone_lib="CSEQCHL15"
/note="Organ: brain; Vector: pBluescript II KS(+); Site_1:
EcoRI; Site_2: NotI; Modification of pBluescript II KS(+)
[Stratagene] vector to accommodate cDNA produced with the
T-trimmed protocol (Construction of uni-directionally
cloned cDNA libraries from messenger RNA for improved 3'
end DNA sequencing by Glenn Fu, et al. U.S. Patent # 6,387,624).
Cut pBluescript II KS(+) with NotI and EcoRI.
Ligate in double stranded adaptor containing BsgI and
BamHI sites [5'ggccgcgtgcagccccggatccgaaaaaaag]
[5'aattctttttttcggatccggggctgcacgc]"

BASE COUNT 303 a 206 c 202 g 240 t

ORIGIN 

Query Match 60.6%; Score 614; DB 13; Length 951; 

Best Local Similarity 82.8%; Pred. No. 1.3e-122; 

Matches 737; Conservative 0; Mismatches 150; Indels 3; Gaps 3; 
Qy 121 GAAGTGTCTAAATC^CTGCAAGCAATGAAAGAAATTC 180 

MINIM Mill MINIMUM II MMMMIMM II I II MM 

Db 6 GAAGTGTCAAAATCTCTGCAAGCAATGAAGTAAATTCTGTGTGGGACCACAGACAAGGAG 65

Qy 181 CCCCCAACAGAAGCT^GTGGCTCAGCTAGCACAAGAACTCTA 240

1 1 N MM Ml I Ml III I INI Mill 1 1 II I INI NUN IN MINI 

Db 66 CCA.CCGACA.GAAGTAGTGGCTCAGCTGGC^ 125 

Qy 241 ACACTGATAGCTGACCTGCAGCTGATAGACTTTGAGGGAAAAAAAGATGTGACCCAGATA 300

10 INN II II II I INN II INN INI I IN INN INN I III Nil 

Db 126 ACACTTATTGCC^CCTGC^GCTC^TAGATTTTGAGGGTAAAAAGGATGTTTCCaVGATA 185

Qy 301 TTTAACAACATCTTGAGAAGACAGATAGGCACTCGGAGTCCTACTGTGGAGTATATTAGT 360 

IN II II Mill II I II III II II Mill M || MINI INI I II WWW 

Db 186 TTTAACAACATCCTGAGAAGACAAATTGGCACAC^ 245 

Qy 361 GCTCATCCTCATATCCTGTTTATGCTCCTCAAAGGAT^ 420

0 N 1 1 1 1 1 1 1 1 f M II 1 1 1 Mill II 11 1 1 1 MINI 1 1 1 1 1 I 1 1 1 1 1 1 1 1 1 

Db 246 GCCCATCCAC^TATCCTGTTC^TGCTTCTGAAAGGCTATGAATCCCCAAATATTGCCTTA 305 

Qy 421 CGTTGTGGGATTATGCTGAGAGAATGTATTCGACATGAACCACTTGCCAA-AATCATCCT 479

N M I I I II II I I II I II II II II II II I I II I N I I I I I II I I II II N 
Db 306 CGCTGTGGAATTATGCTGAGGGAGTG(^TCCGACATGAACCATTGGCC^CAATCATACT 365 



Qy 480 CTTTTCTAATCAATTCAGAGATTTCTTTAAGTACGTGGAGTTGTCAACATTTGATATTGC 539
Hill 1 II MINIM INN IMM MMI MMMMMMMM M
Db 366 TTTTT(^GAA.CAGTTCAGAGACTTCTTCAAGTATGTGGAAATGTCAACATTTGATATAGC 425


Qy 540 TTCAGATGCCTTTGCTACTTTCAAGGATTTACTAACCAGACATAAAGTGTTGGTAGCAGA 599
M MIMMMIMM M M M II II II II II II II MIMIMIIIII
Db 426 ATCTGATGCCTTTGCTACATTCAAGGACTTGTTAACAAGGCACAAGTTGTTGGTAGCAGA 485


Qy 600 CTTCTTAGAACAAAATTACGACACTATTTTTGAAGACTATGAGAAATTGCTTCAGTCTGA 659
M 1 M 1 1 1 M II II II II II IMM II INN III | MMI IMM
Db 486 TTTTATGGAACAAAATTATGATACGATCTTTGAGGATTATGAAAAACTCCTTCATTCTGA 545


Qy 660 GAATTATGTTACTAAGAGACAGTCTTTAAAGCTGCTAGGGGAGCTGATCCTGGACCGTCA 719
N 1 1 1 1 M II Ml IIIIMIMM IMIIIII II II INI II III 1 II
Db 546 GAATTACGTAACAAAGAGACAGTCTTTGAAGCTGCTGGGTGAATTGATTCTAGACAGACA 605


Qy 720 CAACTTTGCCATCATGACAAAGTATATCAGCAAGCCGGAGAACCTGAAACTCATGATGAA 779
MINI 1 1 1 1 M 1 1 1 11 1 1 1 INNNINI II INM IMM II MINIM
Db 606 CAACTTCGCCATCATGACAAAATATATCAGCAAACCAGAGAATCTGAAGCTGATGATGAA 665


Qy 780 CCTCCTTCGGGATAAAAGTCCCAACATCCAGTTTGAAGCCTTTCATGTTTTTAAGGTGTT 839
1 1 N N N INN NINNI II III! || IMM II Mill II
Db 666 CTTGCTGCGAGACAAAAGCCCCAACATTCAATTTGAAGCATTCCATGTGTTCAAGGTTTT 725


Qy 840 TGTGGCCAGTCCTCACAAAACACAGCCTATTGTGGAGATCCTGTTAAAAAATCAGCCCAA 899
NINIIIIIII MINN INN II NINNINN 1 INN III Nil
Db 726 TGTGGCCAGTCCAAACAAAACTCAGCCCATCGTGGAGATCCTGCTGAAAAACCAG-CCAA 784


Qy 900 ACTCATTGAGTTTCTGAGCAGCTTCCA-AAAAGAAAGGACGGATGATOAnrArTTrrr'Tr 958
INM IMIIIII Mill Ml II MM Ml IMIIIII 1 1
Db 785 GCTCATCGAGTTTCTGAGCCATTTCCAGAAACGAGAGGACGGTTGACGAGCAGTTCACCG 844


Qy 959 ACGAGAAGAACTACTTGATTAAACAGATCCGAGACTTGAAGAAAACGGCC 1008
MMMMMIMI MM II M MMIIIIIIIMIIII 1 1 1
Db 845 ACGAGAAGAACTACCTGATCAAGCAAATCCGAGACTTGAAGAAGGCCGAC 894





RESULT 12
BQ669953
LOCUS BQ669953 982 bp mRNA linear EST 15-JUL-2002
DEFINITION AGENCOURT_8203755 NIH_MGC_102 Homo sapiens cDNA clone IMAGE:6255924
5', mRNA sequence.
ACCESSION BQ669953
VERSION BQ669953.1 GI:21780787
KEYWORDS EST.
SOURCE Homo sapiens (human)
ORGANISM Homo sapiens
Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE 1 (bases 1 to 982)
AUTHORS NIH-MGC http://mgc.nci.nih.gov/.
TITLE National Institutes of Health, Mammalian Gene Collection (MGC)
JOURNAL Unpublished
COMMENT Contact: Robert Strausberg, Ph.D.
Email: cgapbs-r@mail.nih.gov
Tissue Procurement: ATCC
cDNA Library Preparation: Rubin Laboratory
cDNA Library Arrayed by: The I.M.A.G.E. Consortium (LLNL)
DNA Sequencing by: Agencourt Bioscience Corporation
Clone distribution: MGC clone distribution information can be
found through the I.M.A.G.E. Consortium/LLNL at:
http://image.llnl.gov
Plate: LLCM2407 row: m column: 13
High quality sequence stop: 508.
FEATURES Location/Qualifiers
source 1..982
/organism="Homo sapiens"
/mol_type="mRNA"
/db_xref="taxon:9606"
/clone="IMAGE:6255924"
/tissue_type="epidermoid carcinoma, cell line"
/lab_host="DH10B (phage-resistant)"
/clone_lib="NIH_MGC_102"
/note="Organ: salivary gland; Vector: pOTB7; Site_1: XhoI;
Site_2: EcoRI; cDNA made by oligo-dT priming.
Directionally cloned into EcoRI/XhoI sites using the
following 5' adaptor: GGCACGAG(G). Library constructed
by Ling Hong in the laboratory of Gerald M. Rubin
(University of California, Berkeley) using ZAP-cDNA
synthesis kit (Stratagene) and Superscript II RT (Life
Technologies). Note: this is a NIH MGC Library"

BASE COUNT 217 a 200 c 357 g 197 t 11 others

ORIGIN 



Query Match 58.6%; Score 594.2; DB 13; Length 982; 

Best Local Similarity 96.5%; Pred. No. 2.6e-118; 

Matches 628; Conservative 0; Mismatches 20; Indels 3; Gaps 



Qy 334 CGGAGTCCTACTGTGGAGTATATTAGTGCTCATCCTCATATCCTGTTTATGCTCCTCAAA 393
IIIMIMIIIIMIIII IIIIIIIIMIMIMIIIIIIIMI MIIIIIIIIMM
Db 1 CGGAGTCCTACTGTGGAG-ATATTAGTGCTCATCCTCATATCCTGGTTATGCTCCTCAAA 59


Qy 394 GGATATGAAGCCCCACAGATTGCCTTACGTTGTGGGATTATGCTGAGAGAATGTATTCGA 453
Ml 1 1 1 M 1 M 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 Ml II II Mill II Ml IN Ml MUM
Db 60 GGATATGAAGCCCCACAGATTGCCTTACATTGGGGGATTATGCTGAGAGAATGGATTCGA 119


Qy 454 CATGAACCACTTGCCAAAATCATCCTCTTTTCTAATCAATTCAGAGATTTCT 513
1 1 1 II 1 1 II 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 M 1 II II II II 1 1 1 M 1 1 1 II II 1 1 1 1 1
Db 120 CATGAACCACTTGCC^WU^TCATCCTCTTTTCTAATCAATTC^GAGATTTCTTTAAGTAC 179


Qy 514 GTGGAGTTGTO^CATTTGATATTGCTTCAGATGCCTTTGCTACTTTCAAGGATTTACTA 573
1 1 1 1 1 1 1 M f I M 1 1 M 1 1 1 1 1 1 1 1 1 1 1 1 M 1 1 1 M 1 1 1 1 1 1 M 1 1 1 1 1 1 1 1 ! 1 1 1 1 1 1 1
Db 180 GTGGAGTTGTCAACATTTGATATTGCTTCAGATGCCTTTGCTACTTTCAAGGATTTACTA 239


Qy 574 ACCAGACATAAAGTGTTGGTAGCAGACTTCTTAGAACAAAATTACGACACTATTTTTGAA 633
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M 1 1 1 1| 1 1| 1 1 1 1 M 1 1 1 1 1 II
Db 240 ACCAGACATAAAGTGGTGGTAGCAGACTTCTTAGAACAAAATTACGACACTATTTTTGAA 299


Qy 634 GACTATGAGAAATTGCTTCAGTCTGAGAATTATGTTACTAAGAGACAGTCTTTAAAGCTG 693
L 1 ' 1 M 1 1 1 1 1 1 1 1 1 M 1 1 1 1 1 1 1 1 1 1 f f f M 1 1 1 1 1 1 1 1 1 M 1 1 M 1 1 1 1 1 1 1 1 1 1 1 [
Db 300 GACTATGAGAAATTGCTTCAGTCTGAGAATTATGGTACTAAGAGACAGTCTTTAAAGCTG 359


Qy 694 CTAGGGGAGCTGATCCTGGACCGTCACAACTTTGCCAT^ 753
N 1 Ml 1 1 IN IIMI i II MMIliM III II II 1 IMM Ml III Ml 1 1 II i III 1!
Db 360 CTAGGGGAGCTGATCCTGGACCGTra 419

Qy 754 CCGGAGAACCTGAAACTCATGATGAACCTCCTTCGGGATAAAAGTCCCAACATCCAGTTT 813
MIIIIIMIIIIIIIIIMIIIIIIIIIIIIIIIIIIIIlllllMIIIIIIIIIIIII
Db 420 CCGGAGAACCTGAAACTCATGATGAACCTCCTTCGGGATAAAAGTCCCAACATCCAGTTT 479

Qy 814 GAAGCCTTTCATGTTTTTAAGGTGTTTG 873
MM Mill II 1 1 lllllill I llllllllll II II MM I III Mill III I MM
Db 480 GAAGCCTTTCATGGTTTTAAGGGGGTTGTGGC^GTCCTC^QU^C^C^GCCTATTGTG 539

Qy 874 GAGATCCTGTTAAAAAATCAGCCCAAACT(^TTGAGTTTCTGAG(^GCTTCCAAAAAGAA 933
"I I I I II I II II I II I I II II II I I I II II I I I II I II II I II I I M II II I
Db 540 GAGATCCTGGTAAAAAATCAGCCOy^CTC^TTGAGTTTCTGAGCAGCTTCCAAAAAGA^ 599

Qy 934 AGG--ACGGATGATGAGCAGTTCGCTGACGAGAAGAACTACTTGATTAAAC 982
III M II I I II II II M llllllllll lllllill | Ml
Db 600 AGGGACGGGATGATGAGCANNTCCCTGACGAGAAAGACTACTTGGGTTAAC 650


RESULT 13
BU518807
LOCUS BU518807 934 bp mRNA linear EST 12-SEP-2002
DEFINITION AGENCOURT_10171930 NIH_MGC_134 Mus musculus cDNA clone
IMAGE:6516567 5', mRNA sequence.
ACCESSION BU518807
VERSION BU518807.1 GI:22826333
KEYWORDS EST.
SOURCE Mus musculus (house mouse)
ORGANISM Mus musculus
Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
REFERENCE 1 (bases 1 to 934)
AUTHORS NIH-MGC http://mgc.nci.nih.gov/.
TITLE National Institutes of Health, Mammalian Gene Collection (MGC)
JOURNAL Unpublished
COMMENT Contact: Robert Strausberg, Ph.D.
Email: cgapbs-r@mail.nih.gov
Tissue Procurement: Dr. David Rowe
cDNA Library Preparation: Invitrogen Corp
cDNA Library Arrayed by: The I.M.A.G.E. Consortium (LLNL)
DNA Sequencing by: Agencourt Bioscience Corporation
Clone distribution: MGC clone distribution information can be
found through the I.M.A.G.E. Consortium/LLNL at:
http://image.llnl.gov
Plate: LLAM14095 row: e column: 16
High quality sequence stop: 656.
FEATURES Location/Qualifiers
source 1..934
/organism="Mus musculus"
/mol_type="mRNA"
/db_xref="taxon:10090"
/clone="IMAGE:6516567"
/tissue_type="undifferentiated limb"
/lab_host="DH10B (phage-resistant)"
/clone_lib="NIH_MGC_134"
/note="Vector: pCMV-SPORT6.1.ccdb; Site_1: EcoRV; Site_2:
Not I; Cloned unidirectionally. Primer: Oligo dT. Average
insert size 1.7 kb. Constructed by ResGen, Invitrogen
Corp. Note: this is a NIH_MGC Library"

BASE COUNT 301 a 198 c 200 g 234 t 1 others

ORIGIN 

Query Match 57.7%; Score 585.2; DB 13; Length 934; 

Best Local Similarity 89.0%; Pred. No. 2.3e-116; 

Matches 654; Conservative 0; Mismatches 79; Indels 2; Gaps 2; 

Qy 2 TGAAAAAAATGCCTTTGTTTAGTAAATCACACAAAAATCCAGCAGAAATTGTGAAAATCC 61
NIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIMMIIIMI Mill I
Db 113 TGAAAAAAATGCCTTTGTTTAGTAAATCACACAAAAATCCAGCAGAAATTGTCAAAATTC 172

Qy 62 TGAAAGACAATTTGGCCATTTTGGAAAAGCAAGACAAAAAGACAGACAAGGCTTCAGAAG 121
Mllllllll iMIMMIM IIMIIMIMMIflMlllf IIIMMIIIIMI
Db 173 TGAAAGACAACCTGGCCATTTTGGAAAAGCAAGACAAAAAGACAGACAAGGCTTCAGAAG 232

Qy 122 AAGTGTCTAAATCACTGCAAGCAATGAAAGAAATTCTGTGTGGTACAAACGAGAAAGAAC 181
I Hill INN llllllllllllll I M I I il 1 1 mil II M I
Db 233 AGGTGTCAAAATCTCTGO^GCAATGAAGGAAATTCTGTGTGGAACGAACGACAAGGAGC 292

Qy 182 CCCCAACAGAAGCAGTGGCTCAGCTAGCACAAGAACTCTACAGCAGTGGCCTGCTAGTGA 241
INI Mllllllll Mill II II II Mill 1 1 Ml I II MM MM
Db 293

Qy 242 ^ x j- ^ ^-rto ^ i kj±\ i xii OAiaULjAAAAAAAGA I Cj TGACCCAGATAT 301
UN 1 1 1 M 1 1 1 1 1 1 1 1 1 1 M I II II 1 1 M II 1 1 1 1 1 1 II 1 1 II I II I II II II II II
Db 353 CACTCATAGCTGACCTGCAGCTCATAGACTTTGAGGGAAAAAAAGATGTGACCCAGATAT 412

Qy 302 TTAACAACATCTTGAGAAGACAGATAGGCACTCGGAGTCCTACTGTGGAGTATATTAGTG 361
I 1 1 I I ( f I I I MM Mil || II III Mllllllll Mill II III
Db 413 TCAACAACATCCTGAGAAGACAGATTGGTACACGGTGTCCTACTGTCGAGTACATCAGTT 472

Qy 362 CTCATCCTCATATCCTGTTTATGCTCCTCAAAGGATATGAAGCCCCACAGATTGCCTTAC 421
IIIHIMM llllllllllllll M M MM I, I II I II hM
Db 473 CTCATCCT(2A(^TCCTGTTTATGCTTCTCAAAGGCTATGAAGCCCCACAGATTGCCTTAC 532

Qy 422 GTTGTGGGATTATGCTGAGAGAATGTATTCGACATGAACCACTTGCCAAAATCATCCTCT 481
I llllllllllllll Mill I II I I II II I I I II | M | I I M II I I I I II I I || |
Db 533 GCTGTGGGATTATGCTAAGAGAGTGTATTCGACATGAGCCACTTGCCAAAATCATCCTAT 592

Qy 482 TTTCTAATCAATTCAGAGATTTCTTTAAGTACGTGGAGTTGTCAACATTTGATATTGCTT 541
Mllllllll M M I II I II I II I Mill II III MM II 1 1| || || | MM
Db 593 TTTCTAATCAGTTCAGAGATTTCTTCAAGTATGTTGAGCTGTCCACCTTTGATATCGCTT 652

Qy 542 CAGATGCCTTTGCTACTTTCAAGGATTTACTAACCAGACATAAAGTGTTGGTAGCAGACT 601
1 1 1 1 1 1 1 1 1 1 MM MIIIMI lid. ,|| MIMMMIMI
Db 653 CAGATGCCTTCGCTACTTTTAAGGATTTGTTAACCAGACATAAAGTATTGGTAGCAGACT 712

Qy 602 TCTTAGAACAAAATTACGACACTATTTTTGAAGACTATGAGAAATTGCTTCAGTCTGAGA 661
I MMI M III I MM MM II II I III II Ml II II MIMI MM II II Ml
Db 713

Qy 662
I Mill II Mill III IMIMIM IMIIIII II Ml i | | |
Db 773 ^.CTATGTGACAAAGAGAACATTCTTTAAAGTTGCTAGGGTGAGCTGATCCCTGGACCGCC 832

Qy 720 CAACTTTGCCATCAT 734
II II II III
Db 833 CACAATTTTCACCAT 847



RESULT 14
CD354831
LOCUS CD354831 713 bp mRNA linear EST 29-MAY-2003
DEFINITION UI-M-GM0-cge-i-10-0-UI.r1 NIH_BMAP_GM0 Mus musculus cDNA clone
IMAGE:30361641 5', mRNA sequence.
ACCESSION CD354831
VERSION CD354831.1 GI:31147332
KEYWORDS EST.
SOURCE Mus musculus (house mouse)
ORGANISM Mus musculus
Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
REFERENCE 1 (bases 1 to 713)
AUTHORS NIH-MGC http://mgc.nci.nih.gov/.
TITLE National Institutes of Health, Mammalian Gene Collection (MGC)
JOURNAL Unpublished
COMMENT Contact: Robert Strausberg, Ph.D.
Email: cgapbs-r@mail.nih.gov
Tissue Procurement: Dr. Jim Lin, University of Iowa
cDNA Library preparation: Dr. M. Bento Soares, University of Iowa
cDNA Library Arrayed by: Dr. M. Bento Soares, University of Iowa
DNA Sequencing by: Dr. M. Bento Soares, University of Iowa
Clone Distribution: Distribution information can be found at
http://genome.uiowa.edu/distribution/mousef 1 .html
This clone was contributed by the Brain Molecular Anatomy Project
(BMAP)
Seq primer: pYX-5.
FEATURES Location/Qualifiers
source 1..713
/organism="Mus musculus"
/mol_type="mRNA"
/strain="C57BL/6"
/db_xref="taxon:10090"
/clone="IMAGE:30361641"
/tissue_type="whole brain"
/dev_stage="1, 5 and 15 days newborn"
/lab_host="DH10B (T1 phage resistant)"
/clone_lib="NIH_BMAP_GM0"
/note="Organ: Brain; Vector: pYX-Asc; Site_1: EcoR I;
Site_2: Not I; The library was constructed according
Bonaldo, Lennon and Soares, Genome Research, 6:791-806,
1996. Denatured RNA was size fractionated on a 1% agarose
gel. First strand cDNA synthesis was primed with oligo-dT
primer containing a Not I site. Double strand cDNA was size
selected according to mRNA size fraction, ligated with EcoR
I adaptor, digested with Not I and then cloned
directionally into pYX-Asc vector. The library tag
sequence located between the Not I site and the polyA tail
is CGAACTGAAT. This library was created for the University
Iowa Brain Anatomy Project (BMAP): 'Gene Discovery in the
Developing Mouse Nervous System', supported by National
Institute of Mental Health (NIMH), Hemin Chin, Ph.D.,
program coordinator."

BASE COUNT 220 a 159 c 140 g 192 t 2 others

ORIGIN



Query Match 57.1%; Score 579.2; DB 14; Length 713; 

Best Local Similarity 89.9%; Pred. No. 4.4e-115; 

Matches 642; Conservative 0; Mismatches 70; Indels 2; Gaps 2; 

Qy 282 AAAAGATGTGACCCAGATATTTAACAACAT-CTTGAGAAGACAGATAGGCACTCGGAGTC 340
MINN N II MINIM I MINIMUM M 1 1 Ml Ml
Db 1 AAAAGATGTGACCCAGATATTC^CAACATCCNTGAGAAGACAGATTGGTACACGGTGTC 60

Qy 341 CTACTGTGGAGTATATTAGTGCTCATCCTCATATCCTGTTTATGCTCCTCAAAGGATATG 400
IMIIII Mill M III 1 1 1 1 1 i 1 1 1 1 1 1 1 1 1! 1 1 1 1 1 1 1 1 IIIIIMI MM
Db 61 CTACTGTCGAGTACATCAGTTCTCATCCTCACATCCTGTTTATGCTTCTCAAAGGCTATG 120

Qy 401 AAGCCCCACAGATTGCCTTACGTTGTGGGATTATGCTGAGAGAATGTATTCGACATGAAC 460
IMMIMMIMIMIMM MIMMMMMI Mill 1 1 1 1 1 1 M 1 1 1 1 M I
Db 121 AAGCCCCACAGATTGCCTTACGCTGTGGGATTATGCTAAGAGAGTGTATTCGACATGAGC 180

Qy 461 (^CTTGCOVAAATCATCCTCTTTTCTAATCAATTCAGAGATTTCTTTAAGTACGTGGAGT 520
MMMMMIMIIMI IMIMIMM MIMMMMMI Mill II III
Db 181 CACTTGCCAAAATCATCCTATTTTCTAATCAGTTCAGAGATTTCTTCAAGTATGTTGAGC 240

Qy 521 TGTOUVCATTTGATATTGCTTCAGATGCCTTTGCTACTTTCAAGGATTTACTAACCAGAC 580
MM II IIIIIMI I ! 1 1 1 1 II 1 1 1 1 1 IIIIIMI IIIIIMI IIIIIMI
Db 241 TGTCCACCTTTGATATCGCTTCAGATGCCTTCGCTACTTTTAAGGATTTGTTAACCAGAC 300

Qy 581 ATAAAGTGTTGGTAGCAGACTTCTTAGAACAAAATTACGACACTATTTTTGAAGACTATG 640
IMIIII l I I II il I I I II 1 1 ! 1 1 ! 1 1 1 1 ! I
Db 301 ATAAAGTATTGGTAGCAGACTTCTTAGAACAAAATTATGACACTATTTTTGAAGACTATG 360

Qy 641 AGAAATTGCTTCAGTCTGAGAATTATGTTACTAAGAGACAGTCTTTAAAGCTGCTAGGGG 700
Mill MM II IIIIIMI Mill II IIIIIMI I II IMIIII I
Db 361 AGAAACTGCTGCAATCTGAGAACTATGTGACAAAGAGACAATCTTTAAAGTTGCTAGGTG 420

Qy 701 AGCTGATCCTGGACCGTCACAACTTTGCC^TCATGAOW^GTATATCAGCAAGCCGGAGA 760
II II I II 1 1 II I II 1 1 Mill II MM Mill I IN N IN I MM
Db 421 AGCTGATCCTGGACCGCCACAATTTCACCATTATGACCAAGTATATCAGCAAGCCAGAGA 480

Qy 761 ACCTGAAACTCATGATGAACCTCCTTCGGGATAAAAGTCCCAACATCCAGTTTGAAGCCT 820
1 1 1 1 1 1 i 1 1 1 IMMMIMI Mill II 1 1 1 1 1 1 1 ! 1 1 1 1 1 1 1 1 1 II IMIIII
Db 481 ACCTGAAACTGATGATGAACCTGCTTCGAGAC?WUVGTCCCAAC^TCCAATTCGAAGCCT 540

Qy 821 TTCATGTTTTTAAGGTGTTTGTGGCCAGTC 880
I Mill 1 1 1 1 1 1 1 1 1 1 1 M 1 1 1 1 1 II IIIIIMI IIIIIMI IIIIIMI I
Db 541 TCCATGTCTTTAAGGTGTTTGTGGCCAGCCCCCACAAAACGCAGCCTATCGTGGAGATTC 600

Qy 881 TGTTAAAAAATC^GCCCAAACTCATTGAGTTTCTGAGCAGCTTCCAAAAAGAAAGGACGG 940
i 1 1 M 1 1 1 : 1 1 1 II 1 1 h I M 1 1 M 1 1 1 1 M M I ! I M 1 1 1 1 II MINIMI I
Db 601 TGTTAAAAAATCAGCCCAAACTCATTGAGTTTCTGAGCAGCTTTCAGAAAGAAAGGACAG 660

Qy 941 ATGATGAGCAGTTCGCTGACGAGAAGAACTACTTGATTAAACAGATCCGAGACT 994
I II IIIIIMI I II i Mill MIMI MUM IMIIII
Db 661




RESULT 15 
HSM073180 

ID HSM073180 standard; RNA; EST; 742 BP.
XX 

AC BX483012; 
XX 

SV BX483012.1 
XX 

DT 09-MAY-2003 (Rel. 75, Created)

DT 09-MAY-2003 (Rel. 75, Last updated, Version 1) 
XX 

DE Homo sapiens mRNA; EST DKFZp686C08234_r1 (from clone DKFZp686C08234)
XX 

KW EST; expressed sequence tag. 
XX 

OS Homo sapiens (human) 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia;

OC Eutheria; Primates; Catarrhini; Hominidae; Homo. 

XX 

RN [1] 
RP 1-742 

RA Ottenwaelder B., Obermaier B., Deutschenbaur S., Mewes H.W., Weil B.,
RA Amid C., Osanger A., Fobo G., Han M., Wiemann S.;

RT 

RL Submitted (07-MAY-2003) to the EMBL/GenBank/DDBJ databases.
RL MIPS, Ingolstaedter Landstr.1, D-85764 Neuherberg, GERMANY
XX 

CC This is the 5' sequence of the clone insert 

CC Clone from S. Wiemann, Molecular Genome Analysis, German Cancer 
CC Research Center (DKFZ); Email s.wiemann@dkfz-heidelberg.de;
CC sequenced by MediGenomix (Martinsried/Germany) within the cDNA
CC sequencing consortium of the German Genome Project. 
CC No si sequence available. 

CC This clone (DKFZp686C08234) is available at the RZPD in Berlin.
CC Please contact the RZPD: Ressourcenzentrum, Heubnerweg 6, 
CC 14059 Berlin-Charlottenburg, GERMANY; Email: clone@rzpd.de 
XX 

FH Key Location/Qualifiers 
FH 

FT source 1..742
FT /db_xref="taxon:9606"
FT /mol_type="mRNA"
FT /organism="Homo sapiens"
FT /clone="DKFZp686C08234"
FT /clone_lib="686 (synonym: hlcc3). Vector pSport1_Sfi; host
FT DH10B; sites SfiIA + SfiIB"
FT /dev_stage="adult"
FT /tissue_type="cDNA-collection"

XX 

SQ Sequence 742 BP; 256 A; 143 C; 162 G; 179 T; 2 other; 

Query Match 57.0%; Score 578; DB 2; Length 742; 

Best Local Similarity 99.8%; Pred. No. 8e-115; 

Matches 578; Conservative 0; Mismatches 1; Indels 0; Gaps 0; 

Qy 1 atgaaaaaaatgcctttgtttagtaaatc^cacaaaaatcc^^ 60
I INI li MM II Mi I II IIMMIM 1 1 1 1 1 ! 1 1! I II I II 1 1 Ml MM 1 1 MM
Db 164 ATGAAAAAAATGCCTTTGTTTAGTAAATCACACAAAAATCCAGCAGAAATTGTGAAAATC 223

Qy 61 CTGAAAGACAATTTGGCCATTTTGGAAAAGCAAGACAAAAAGACAGACAAGGCTTCAGAA 120
IMIIMIIIIMIM MIMIIMMIMIMIIMMI Ml MIMIIIMIIIII
Db 224 CTGAAAGACAATTTGGCCATTTTGGAAAAGCAAGACAAAAAGACA£^ 283

Qy 121 GAAGTGTCTAAATCACTGOUVGC^TGAAAGAAATTCTGTGTGGTACAAACGAGAAAG^ 180
Mil 1 1 1 1 1 1 1 1 1 1 1 III I III III ill 1 1 ill 1 1 1 1 1 1 1! 1 1 1 1 1 1 1 II 1 1 1 1 1 II 1 1
Db 284 GAAGTGTCTAAATCACTGCAAGCAATGAAAGAAATTCTGTGTGGTACAAACGAGAAAGAA 343

Qy 181 CCCCCAACAGAAGCAGTGGCTCAGCTAGCACAAGAACTCT^ 240
I I I I I i I I 1 I I I I I I I 1 I I 1 I 1 1 I I 1 I I I I 1 1 I I 1 I 1 I I I 1 I I 1 I I I I 1 1 I I 1 I 1 1 1 I I I
Db 344 CCCCCAACAGAAGCAGTGGCTCAGCTAGCACAAGAACTCTACAGCAGTGGCCTC 403

Qy 241 ACACTGATAGCTGACCTGCAGCTGATAGACTTTGAGGGAAAAAAAGATGTGACCCAGATA 300
1 1 1 1 1 1 M 1 1 IM MM I M I M 1 1 1 1 1 M II I II I II M 1 1 1 M M I Ml I M II 1 1 1 II
Db 404 ACACTGATAGCTGACCTGCAGCTGATAGACTTTGAGGGAAAAAAAGATGTGACCCAGATA 463

Qy 301
1 1 ,1 ! 1 1 !l I II 1 1 1 1 1 1 II 1 1 1 1 1 ,1 1 MM II i 1 1 1 1 1 1 1 1 1 1 M 1 1 1 1 1 ! I! il I M
Db 464 TTTAACAACATCTTGAGAAGACAGATAGGCACTCGGAGTCCTACTGTGGAGTATATTAGT 523

Qy 361 GCTCATCCTCATATCCTGTTTATGCTCCTCAAAGGATATGAAGC^ 420
1 1 1 1 >j i! 1 1 1 1 1 1 1 1 m i ! 1 1 1 1 1 mi 1 1 1 1 1 ; 1 1 1 1 M i 1 1 1 1 ! 1 1 1 1 1 1 1 ! II
Db 524 GCTCATCCTCATATCCTGTTTATGCTCCTCAAAGGATATGAAGCCCCACAGATTGCCTTA 583

Qy 421 CGTTGTGGGATTATGCTGAGAGAATGTATTCGACATGAACCACTTGCCAAAATCATCCTC 480
i i i I ! 1 1 1 1 1 i 1 1 ! 1 1 1 1 ' i : 1 i I , I ' i S 1 1 1 1 1 ! i 1 1 1 1 Mill , 1 1 1 1
Db 584 CGTTGTGGGATTATGCTGAGAGAATGTATTCGACATGAACCACTTGCCAAAATCATCCTC 643

Qy 481 TTTTCTAATCAATTCAGAGATTTCTTTAAGTACGTGGAGTTGTCAACATTTGATATTGCT 540
MM I Ml Ml MMM Ml Ml II 1 1 1 II 1 1 1 1 IMi I M I M I M II M 1 1 1 1 II I
Db 644 TTTTCTAATCAATTCAGAGATTTCTTTAAGTACGTGGAGNTGTCAACATTTGATATTGCT 703

Qy 541 TCAGATGCCTTTGCTACTTTCAAGGATTTACTAACCAGA 579
MM 1 1 1 II I II I II 1 1 1 II 1 1 1 1 ,1 1 1 UN III II!
Db 704 TCAGATGCCTTTGCTACTTTCAAGGATTTACTAACCAGA 742



Search completed: January 6, 2004, 03:18:09 
Job time: 2589 secs



